{ "info": { "author": "Taha Zerrouki", "author_email": "taha.zerrouki@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 5 - Production/Stable", "Intended Audience :: Developers", "Natural Language :: Arabic", "Operating System :: OS Independent", "Programming Language :: Python", "Topic :: Text Processing :: Linguistic" ], "description": "Tashaphyne\n==========\n\n**Tashaphyne**: Arabic Light Stemmer \u062a\u0627\u0634\u0641\u064a\u0646: \u0627\u0644\u062a\u062c\u0630\u064a\u0639 \u0627\u0644\u062e\u0641\u064a\u0641 \u0644\u0644\u0646\u0635\u0648\u0635\n\u0627\u0644\u0639\u0631\u0628\u064a\u0629\n\n**Tashaphyne** is an Arabic light stemmer and segmentor. It mainly\nsupports light stemming (removing prefixes and suffixes) and gives all\npossible segmentations. It uses a modified finite-state automaton, which\nallows it to generate all segmentations.\n\nUnlike the Khoja stemmer, ISRI stemmer, Assem stemmer, and Farasa stemmer,\nit offers stemming and root extraction at the same time.\n\n**Tashaphyne** comes with default prefixes and suffixes, and accepts\ncustomized prefix and suffix lists, which lets it handle more aspects\nand build customized stemmers without changing the code.\n\n**Tashaphyne** is a Python library. It is available as a demo on\n`Mishkal `__ (choose Tools/Analysis) and as\nsource code on `Github `__\n\nDevelopers: Taha Zerrouki: http://tahadz.com taha dot zerrouki at gmail\ndot com\n\n+---------+------------------------------------------------------------------+\n| Feature | value |\n| s | |\n+=========+==================================================================+\n| Authors | `Authors.md `__ |\n+---------+------------------------------------------------------------------+\n| Release | 0.3 |\n+---------+------------------------------------------------------------------+\n| License | `GPL `_ |\n| | _ |\n+---------+------------------------------------------------------------------+\n| Tracker | `linuxscout/tashaphyne/Issues `__ 
|\n+---------+------------------------------------------------------------------+\n| Website | https://pypi.python.org/pypi/Tashaphyne |\n+---------+------------------------------------------------------------------+\n| Doc | `package Documentation `__ |\n+---------+------------------------------------------------------------------+\n| Source | `Github `__ |\n+---------+------------------------------------------------------------------+\n| Downloa | `sourceforge `__ |\n| d | |\n+---------+------------------------------------------------------------------+\n| Feedbac | `Comments `__ |\n| ks | |\n+---------+------------------------------------------------------------------+\n| Account | [@Twitter](https://twitter.com/linuxscout) |\n| s | [@Sourceforge](http://sourceforge.net/projects/tashaphyne/) |\n+---------+------------------------------------------------------------------+\n\nCitation\n--------\n\nIf you cite it in academic work, please use this citation:\n\n::\n\n T. Zerrouki\u200f, Tashaphyne, Arabic light stemmer\u200f, https://pypi.python.org/pypi/Tashaphyne/0.2\n\n\n\n\u0645\u0632\u0627\u064a\u0627\n-----\n\n- \u062a\u062c\u0630\u064a\u0639 \u0627\u0644\u0643\u0644\u0645\u0629 \u0627\u0644\u0639\u0631\u0628\u064a\u0629 \u0625\u0644\u0649 \u0623\u0628\u0633\u0637 \u062c\u0630\u0639 \u0645\u0645\u0643\u0646\n- \u0625\u0645\u0643\u0627\u0646\u064a\u0629 \u0627\u0633\u062a\u062e\u0631\u0627\u062c \u0627\u0644\u062c\u0630\u0631\n- \u062a\u0642\u0637\u064a\u0639 \u0627\u0644\u0643\u0644\u0645\u0629 \u0625\u0644\u0649 \u062c\u0645\u064a\u0639 \u0627\u0644\u062d\u0627\u0644\u0627\u062a \u0627\u0644\u0645\u0645\u0643\u0646\u0629.\n- \u062a\u0646\u0645\u064a\u0637 \u0627\u0644\u0643\u0644\u0645\u0629 ( \u062a\u0648\u062d\u064a\u062f \u0627\u0644\u062d\u0631\u0648\u0641 \u0630\u0627\u062a \u0627\u0644\u0623\u0634\u0643\u0627\u0644 \u0627\u0644\u0645\u062e\u062a\u0644\u0641\u0629.\n- \u0642\u0627\u0626\u0645\u0629 \u0645\u0633\u0628\u0642\u0629 
\u0644\u0644\u0632\u0648\u0627\u0626\u062f \u0627\u0644\u0639\u0631\u0628\u064a\u0629\u060c \u0648\u062d\u0631\u0648\u0641 \u0627\u0644\u0632\u064a\u0627\u062f\u0629 -\u0625\u0645\u0643\u0627\u0646\u064a\u0629 \u0636\u0628\u0637 \u0625\u0639\u062f\u0627\u062f\u0627\u062a\n \u0627\u0644\u0645\u062c\u0630\u0639 \u0648\u0627\u0644\u0645\u0642\u0637\u0639\u060c \u0645\u0646 \u062e\u0644\u0627\u0644 \u062a\u0639\u062f\u064a\u0644 \u0642\u0648\u0627\u0626\u0645 \u0627\u0644\u0632\u0648\u0627\u0626\u062f.\n\nFeatures\n--------\n\n- Arabic word light stemming.\n- Root extraction.\n- Word segmentation.\n- Word normalization.\n- Default Arabic affix list.\n- A customizable light stemmer: stemmer options and data can be\n changed.\n- Data-independent stemmer.\n\nApplications\n============\n\n- Stemming texts\n- Text classification and categorization\n- Sentiment analysis\n- Named entity recognition\n\nInstallation\n============\n\n::\n\n pip install tashaphyne\n\nUsage\n=====\n\nTashaphyne is a stemmer based on a finite-state automaton. It extracts\naffixes (prefixes and suffixes) using a predefined affix list.\n\nIt extracts all possible affixations from a word and lists all possible\nstemming configurations of that word.\n\nFunctions \u0627\u0644\u062f\u0648\u0627\u0644\n----------------\n\n- \u062a\u062c\u0630\u064a\u0639 \u0627\u0644\u0643\u0644\u0645\u0629\n\n\u062a\u062c\u0630\u064a\u0639 \u0627\u0644\u0643\u0644\u0645\u0629 \u0648\u0627\u0633\u062a\u062e\u0644\u0627\u0635 \u0643\u0644 \u0627\u0644\u0645\u0639\u0644\u0648\u0645\u0627\u062a \u0645\u0646\u0647\u0627 \u0628\u0648\u0627\u0633\u0637\u0629 \u0627\u0644\u062f\u0648\u0627\u0644 \u0627\u0644\u0645\u0646\u0627\u0633\u0628\u0629\n\nThe stemming function stems an Arabic word and returns the stem. It\nstores the stemming positions (left, right) in the instance, so other\ncomputed attributes such as the stem, prefix, suffix, and root can then\nbe retrieved.\n\n.. 
code:: python\n\n >>> #make propre display for unicode\n ... import pyarabic.arabrepr\n >>> arepr = pyarabic.arabrepr.ArabicRepr()\n >>> repr = arepr.repr\n >>> \n >>> from tashaphyne.stemming import ArabicLightStemmer\n >>> ArListem = ArabicLightStemmer()\n >>> word = u'\u0623\u0641\u062a\u0636\u0627\u0631\u0628\u0627\u0646\u0646\u064a'\n >>> # stemming word\n ... stem = ArListem.light_stem(word)\n >>> # extract stem\n ... print ArListem.get_stem()\n \u0636\u0627\u0631\u0628\n >>> # extract root\n ... print ArListem.get_root()\n \u0636\u0631\u0628\n >>> \n >>> # get prefix position index\n ... print ArListem.get_left()\n 3\n >>> # get prefix \n ... print ArListem.get_prefix() \n \u0623\u0641\u062a\n >>> # get prefix with a specific index\n ... print ArListem.get_prefix(2) \n \u0623\u0641\n >>> \n >>> # get suffix position index\n ... print ArListem.get_right()\n 7\n >>> # get suffix \n ... print ArListem.get_suffix() \n \u0627\u0646\u0646\u064a\n >>> # get suffix with a specific index\n ... print ArListem.get_suffix(10) \n \u064a\n >>> # get affix\n >>> print ArListem.get_affix()\n \u0623\u0641\u062a-\u0627\u0646\u0646\u064a\n >>> # get affix tuple\n ... print repr(ArListem.get_affix_tuple()) \n {'prefix': u'\u0623\u0641\u062a', 'root': u'', 'stem': u'', 'suffix': u'\u0623\u0641\u062a\u0636\u0627\u0631\u0628\u0627\u0646\u0646\u064a'}\n >>> # star words\n ... print ArListem.get_starword()\n \u0623\u0641\u062a*\u0627**\u0627\u0646\u0646\u064a\n >>> # get star stem\n ... print ArListem.get_starstem()\n *\u0627**\n >>> \n >>> # get unvocalized word\n ... print ArListem.get_unvocalized()\n \u0623\u0641\u062a\u0636\u0627\u0631\u0628\u0627\u0646\u0646\u064a\n\n+------------+----------------+-------+\n| function | Description | \u0648\u0635\u0641 |\n+============+================+=======+\n| get\\_root( | Get the root | \u0627\u0633\u062a\u062e\u0644 |\n| ) | of the treated | \u0627\u0635 |\n| | word by the | \u0627\u0644\u062c\u0630\u0631 |\n| | stemmer. 
| |\n+------------+----------------+-------+\n| get\\_stem( | Get the stem | \u0627\u0633\u062a\u062e\u0644 |\n| ) | of the treated | \u0627\u0635 |\n| | word by the | \u0627\u0644\u062c\u0630\u0639 |\n| | stemmer. | \u064a\u0645\u0643\u0646 |\n| | | \u0627\u0633\u062a\u062e\u0644 |\n| | | \u0627\u0635 |\n| | | \u0627\u0644\u062c\u0630\u0639 |\n| | | \u0627\u0644\u062a\u0644\u0642 |\n| | | \u0627\u0626\u064a |\n| | | \u0645\u0628\u0627\u0634\u0631 |\n| | | \u0629\u060c |\n| | | \u0639\u0646\u062f |\n| | | \u0627\u0644\u0631\u063a\u0628 |\n| | | \u0629 |\n| | | \u0641\u064a |\n| | | \u0627\u0644\u062d\u0635\u0648 |\n| | | \u0644 |\n| | | \u0639\u0644\u0649 |\n| | | \u062c\u0630\u0639 |\n| | | \u0645\u0639\u064a\u0646\u060c |\n| | | \u0646\u062d\u062f\u062f |\n| | | \u062f\u0644\u064a\u0644 |\n| | | \u0627\u0644\u0633\u0627\u0628 |\n| | | \u0642\u060c |\n| | | \u0648\u062f\u0644\u064a\u0644 |\n| | | \u0627\u0644\u0644\u0627\u062d |\n| | | \u0642. |\n+------------+----------------+-------+\n| get\\_left( | Get the prefix | \u0645\u0648\u0636\u0639 |\n| ) | end position | \u0646\u0647\u0627\u064a\u0629 |\n| | | \u0627\u0644\u0633\u0627\u0628 |\n| | | \u0642\u0629 |\n+------------+----------------+-------+\n| get\\_right | Get the suffix | \u0645\u0648\u0636\u0639 |\n| () | start position | \u0628\u062f\u0627\u064a\u0629 |\n| | | \u0627\u0644\u0644\u0627\u062d |\n| | | \u0642\u0629 |\n+------------+----------------+-------+\n| get\\_prefi | return the | \u0627\u0633\u062a\u0631\u062c |\n| x() | prefix/suffix | \u0627\u0639 |\n| | of the treated | \u0627\u0644\u0633\u0627\u0628 |\n| | word by the | \u0642\u0629 |\n| | stemmer. 
| \u0627\u0644\u062a\u0644\u0642 |\n| | | \u0627\u0626\u064a\u0629 |\n| | | \u0623\u0648 |\n| | | \u0633\u0627\u0628\u0642\u0629 |\n| | | \u0645\u0639\u064a\u0646\u0629 |\n| | | \u0628\u0645\u0648\u0636\u0639 |\n+------------+----------------+-------+\n| get\\_suffi | Get default | \u0627\u0633\u062a\u0631\u062c |\n| x() | suffix, or | \u0627\u0639 |\n| | suffix by | \u0627\u0644\u0644\u0627\u062d |\n| | suffix index | \u0642\u0629 |\n| | | \u0627\u0644\u062a\u0644\u0642 |\n| | | \u0627\u0626\u064a\u0629 |\n| | | \u0623\u0648 |\n| | | \u0628\u0648\u0627\u0633\u0637 |\n| | | \u0629 |\n| | | \u062f\u0644\u064a\u0644 |\n| | | \u0627\u0644\u0644\u0627\u062d |\n| | | \u0642\u0629 |\n+------------+----------------+-------+\n| get\\_affix | Get default | \u0627\u0633\u062a\u0631\u062c |\n| () | Affix or | \u0627\u0639 |\n| | specific by | \u0627\u0644\u0632\u0627\u0626 |\n| | left and right | \u062f\u0629 |\n| | indexes | \u0627\u0644\u062a\u0644\u0642 |\n| | | \u0627\u0626\u064a\u0629 |\n| | | \u0623\u0648 |\n| | | \u0627\u0644\u0645\u0639\u064a |\n| | | \u0646\u0629\u0628\u062f\u0644 |\n| | | \u064a\u0644\u064a |\n| | | \u0627\u0644\u0633\u0627\u0628 |\n| | | \u0642 |\n| | | \u0648\u0627\u0644\u0644\u0627 |\n| | | \u062d\u0642 |\n+------------+----------------+-------+\n| get\\_affix | Get affixe | \u0627\u0633\u062a\u0631\u062c |\n| \\_tuple() | tuple | \u0627\u0639 |\n| | | \u0627\u0644\u0632\u0627\u0626 |\n| | | \u062f\u0629 |\n| | | \u0628\u062a\u0641\u0627\u0635 |\n| | | \u064a\u0644\u0647\u0627 |\n+------------+----------------+-------+\n| get\\_starw | Get stared | \u0627\u0633\u062a\u0631\u062c |\n| ord() | word, radical | \u0627\u0639 |\n| | letters | \u0627\u0644\u062c\u0630\u0639 |\n| | replaced by | \u0627\u0644\u0645\u0646\u062c |\n| | \"*\"\\|\u0627\u0633\u062a\u0631\u062c\u0627\u0639 | \u0645\u060c |\n| | \u0627\u0644\u0643\u0644\u0645\u0629 | \u0627\u0644\u062d\u0631\u0648 |\n| | \u0627\u0644\u0645\u0646\u062c\u0645\u0629\u060c | \u0641 |\n| | 
\u0627\u0644\u062d\u0631\u0648\u0641 \u0627\u0644\u0623\u0635\u0644\u064a\u0629 | \u0627\u0644\u0623\u0635\u0644 |\n| | \u0645\u062e\u0641\u064a\u0629 \u0628\u0646\u062c\u0648\u0645 | \u064a\u0629 |\n| | get\\_starstem( | \u0645\u062e\u0641\u064a\u0629 |\n| | )\\|Get | \u0628\u0646\u062c\u0648\u0645 |\n| | starred stem, | |\n| | radical | |\n| | letters | |\n| | replaced by | |\n| | \"*\" | |\n+------------+----------------+-------+\n| get\\_unvoc | return the | \u0627\u0633\u062a\u0631\u062c |\n| alized() | unvocalized | \u0627\u0639 |\n| | form of the | \u0627\u0644\u0643\u0644\u0645 |\n| | treated word | \u0629 |\n| | by the | \u063a\u064a\u0631 |\n| | stemmer. | \u0645\u0634\u0643\u0648\u0644 |\n| | Harakat are | \u0629 |\n| | stripped. | |\n+------------+----------------+-------+\n\n- \u0627\u0633\u062a\u062e\u0644\u0627\u0635 \u0643\u0644 \u0627\u0644\u062a\u0642\u0633\u064a\u0645\u0627\u062a \u0627\u0644\u0645\u062d\u062a\u0645\u0644\u0629\n\n- \u062a\u0642\u0633\u064a\u0645 \u0627\u0644\u0643\u0644\u0645\u0629 \u0625\u0644\u0649 \u0643\u0644 \u0627\u0644\u0632\u0648\u0627\u0626\u062f \u0627\u0644\u0645\u062d\u062a\u0645\u0644\u0629\n\nGenerate a list of all possible segmentation positions (left, right) of\nthe word treated by the stemmer.\n\n.. code:: python\n\n\n >>> word = u'\u0623\u0641\u062a\u0636\u0627\u0631\u0628\u0627\u0646\u0646\u064a'\n\n >>> # Detect all possible segmentations\n ... print ArListem.segment(word) \n set([(2, 7), (3, 8), (0, 8), (2, 9), (2, 8), (3, 10), (2, 11), (1, 8), (0, 7), (2, 10), (3, 11), (1, 10), (0, 11), (3, 9), (0, 10), (1, 7), (0, 9), (3, 7), (1, 11), (1, 9)])\n\n >>> # Get all segments \n >>> print ArListem.get_segment_list()\n set([(2, 7), (3, 8), (0, 8), (2, 9), (2, 8), (3, 10), (2, 11), (1, 8), (0, 7), (2, 10), (3, 11), (1, 10), (0, 11), (3, 9), (0, 10), (1, 7), (0, 9), (3, 7), (1, 11), (1, 9)])\n\n >>> # get affix list\n ... 
print repr(ArListem.get_affix_list() )\n [{'prefix': u'\u0623\u0641', 'root': u'\u0636\u0631\u0628', 'stem': u'\u062a\u0636\u0627\u0631\u0628', 'suffix': u'\u0627\u0646\u0646\u064a'},\n {'prefix': u'\u0623\u0641\u062a', 'root': u'\u0636\u0631\u0628', 'stem': u'\u0636\u0627\u0631\u0628\u0627', 'suffix': u'\u0646\u0646\u064a'},\n {'prefix': u'', 'root': u'\u0623\u0641\u0636\u0631\u0628', 'stem': u'\u0623\u0641\u062a\u0636\u0627\u0631\u0628\u0627', 'suffix': u'\u0646\u0646\u064a'}, \n {'prefix': u'\u0623\u0641', 'root': u'\u0636\u0631\u0628\u0646', 'stem': u'\u062a\u0636\u0627\u0631\u0628\u0627\u0646', 'suffix': u'\u0646\u064a'}, \n {'prefix': u'\u0623\u0641', 'root': u'\u0636\u0631\u0628', 'stem': u'\u062a\u0636\u0627\u0631\u0628\u0627', 'suffix': u'\u0646\u0646\u064a'}, \n {'prefix': u'\u0623\u0641\u062a', 'root': u'\u0636\u0631\u0628\u0646\u0646', 'stem': u'\u0636\u0627\u0631\u0628\u0627\u0646\u0646', 'suffix': u'\u064a'}, ...]\n >>> \n\n- segment() / get\\_segment\\_list() \u0627\u0633\u062a\u062e\u0644\u0627\u0635 \u0642\u0627\u0626\u0645\u0629 \u0645\u0648\u0627\u0636\u0639 \u0643\u0644 \u0627\u0644\u062a\u0642\u0633\u064a\u0645\u0627\u062a\n \u0627\u0644\u0645\u062d\u062a\u0645\u0644\u0629 \u0639\u0644\u0649 \u0634\u0643\u0644 \u0623\u0639\u062f\u0627\u062f return a list of segmentation positions (left,\n right) of the treated word by the stemmer.\n\n- get\\_affix\\_list \u0627\u0633\u062a\u062e\u0644\u0627\u0635 \u0642\u0627\u0626\u0645\u0629 \u0643\u0644 \u0627\u0644\u0632\u0648\u0627\u0626\u062f \u0627\u0644\u0645\u062d\u062a\u0645\u0644\u0629\n\nreturn a list of affix tuple of the treated word by the stemmer.\n\nCustomized Affix list\n---------------------\n\n\u062a\u062e\u0635\u064a\u0635 \u0642\u0648\u0627\u0626\u0645 \u0627\u0644\u0632\u0648\u0627\u0626\u062f \u064a\u0645\u0643\u0646\u0646 \u062a\u062e\u0635\u064a\u0635 \u0642\u0648\u0627\u0626\u0645 \u0627\u0644\u0633\u0648\u0627\u0628\u0642 \u0648\u0627\u0644\u0644\u0648\u0627\u062d\u0642 
\u0644\u0644\u062d\u0635\u0648\u0644 \u0639\u0644\u0649 \u0646\u062a\u0627\u0626\u062c\n\u0627\u0641\u0636\u0644 \u062d\u0633\u0628 \u0627\u0644\u0633\u064a\u0627\u0642\n\n\u0641\u064a \u0627\u0644\u0645\u062b\u0627\u0644 \u0627\u0644\u0645\u0648\u0627\u0644\u064a\u060c \u0633\u0646\u0633\u062a\u0639\u0645\u0644 \u0645\u062c\u0630\u0639 \u062a\u0627\u0634\u0641\u064a\u0646 \u062d\u0633\u0628 \u0642\u0648\u0627\u0626\u0645\u0647 \u0627\u0644\u062a\u0644\u0642\u0627\u0626\u064a\u0629\u060c \u062b\u0645 \u0646\u0635\u0646\u0639\n\u0645\u062c\u0630\u0639\u0627 \u0622\u062e\u0631 \u064a\u0639\u0637\u064a \u0646\u062a\u0627\u0626\u062c \u0645\u062e\u062a\u0644\u0641\u0629 \u0628\u062a\u062e\u0635\u064a\u0635 \u0642\u0648\u0627\u0626\u0645 \u0627\u0644\u0633\u0648\u0627\u0628\u0642 \u0648\u0627\u0644\u0644\u0648\u0627\u062d\u0642\n\nYou can modify and customize the default affixes list by\n\n.. code:: python\n\n >>> import tashaphyne.stemming\n\n >>> CUSTOM_PREFIX_LIST = [u'\u0643\u0627\u0644', u'\u0623\u0641\u0628\u0627\u0644', u'\u0623\u0641\u0643', u'\u0641\u0643', u'\u0623\u0648\u0644\u0644', u'', u'\u0623\u0641', u'\u0648\u0644', u'\u0623\u0648\u0627\u0644', u'\u0641', u'\u0648', u'\u0623\u0648', u'\u0648\u0644\u0644', u'\u0641\u0628', u'\u0623\u0648\u0644', u'\u0623\u0644\u0644', u'\u0644\u0644', u'\u0628', u'\u0648\u0643\u0627\u0644', u'\u0623\u0648\u0628', u'\u0628\u0627\u0644', u'\u0623\u0643\u0627\u0644', u'\u0627\u0644', u'\u0623\u0628', u'\u0648\u0628', u'\u0623\u0648\u0628\u0627\u0644', u'\u0623', u'\u0648\u0628\u0627\u0644', u'\u0623\u0643', u'\u0641\u0643\u0627\u0644', u'\u0623\u0648\u0643', u'\u0641\u0644\u0644', u'\u0648\u0643', u'\u0643', u'\u0623\u0644', u'\u0641\u0627\u0644', u'\u0648\u0627\u0644', u'\u0623\u0648\u0643\u0627\u0644', u'\u0623\u0641\u0644\u0644', u'\u0623\u0641\u0644', u'\u0641\u0644', u'\u0623\u0627\u0644', u'\u0623\u0641\u0643\u0627\u0644', u'\u0644', u'\u0623\u0628\u0627\u0644', u'\u0623\u0641\u0627\u0644', u'\u0623\u0641\u0628', 
u'\u0641\u0628\u0627\u0644']\n >>> CUSTOM_SUFFIX_LIST = [u'\u0643\u0645\u0627', u'\u0643', u'\u0647\u0646', u'\u064a', u'\u0647\u0627', u'', u'\u0647', u'\u0643\u0645', u'\u0643\u0646', u'\u0647\u0645', u'\u0647\u0645\u0627', u'\u0646\u0627']\n\n >>> # simple stemmer with the default affix list\n ... simple_stemmer = tashaphyne.stemming.ArabicLightStemmer()\n\n >>> # create a customized stemmer object for stemming enclitics and proclitics\n ... custom_stemmer = tashaphyne.stemming.ArabicLightStemmer()\n >>> # configure the stemmer object\n ... custom_stemmer.set_prefix_list(CUSTOM_PREFIX_LIST)\n >>> custom_stemmer.set_suffix_list(CUSTOM_SUFFIX_LIST)\n >>> \n >>> word = u\"\u0628\u0627\u0644\u0645\u062f\u0631\u0633\u062a\u064a\u0646\"\n >>> # segment the word with the default stemmer\n ... simple_stemmer.segment(word)\n set([(4, 10), (4, 7), (4, 9), (4, 8), (3, 10), (0, 7), (3, 8), (1, 10), (1, 8), (3, 9), (0, 10), (1, 7), (0, 9), (3, 7), (0, 8), (1, 9)])\n >>> print repr(simple_stemmer.get_affix_list())\n [{'prefix': u'\u0628\u0627\u0644\u0645', 'root': u'\u062f\u0631\u0633\u062a\u064a\u0646', 'stem': u'\u062f\u0631\u0633\u062a\u064a\u0646', 'suffix': u''}, {'prefix': u'\u0628\u0627\u0644\u0645', 'root': u'\u062f\u0631\u0633', 'stem': u'\u062f\u0631\u0633', 'suffix': u'\u062a\u064a\u0646'}, {'prefix': u'\u0628\u0627\u0644\u0645', 'root': u'\u062f\u0631\u0633\u062a\u064a', 'stem': u'\u062f\u0631\u0633\u062a\u064a', 'suffix': u'\u0646'}, {'prefix': u'\u0628\u0627\u0644\u0645', 'root': u'\u062f\u0631\u0633\u062a', 'stem': u'\u062f\u0631\u0633\u062a', 'suffix': u'\u064a\u0646'}, {'prefix': u'\u0628\u0627\u0644', 'root': u'\u0645\u062f\u0631\u0633\u062a\u064a\u0646', 'stem': u'\u0645\u062f\u0631\u0633\u062a\u064a\u0646', 'suffix': u''}, {'prefix': u'', 'root': u'\u0628\u0627\u0644\u0645\u062f\u0631\u0633', 'stem': u'\u0628\u0627\u0644\u0645\u062f\u0631\u0633', 'suffix': u'\u062a\u064a\u0646'}, ...]\n >>> \n >>> custom_stemmer.segment(word)\n set([(1, 10), (3, 10), (0, 10)])\n >>> \n >>> 
print repr(custom_stemmer.get_affix_list())\n [{'prefix': u'\u0628', 'root': u'\u0627\u0644\u0645\u062f\u0631\u0633\u062a\u064a\u0646', 'stem': u'\u0627\u0644\u0645\u062f\u0631\u0633\u062a\u064a\u0646', 'suffix': u''}, {'prefix': u'\u0628\u0627\u0644', 'root': u'\u0645\u062f\u0631\u0633\u062a\u064a\u0646', 'stem': u'\u0645\u062f\u0631\u0633\u062a\u064a\u0646', 'suffix': u''}, {'prefix': u'', 'root': u'\u0628\u0627\u0644\u0645\u062f\u0631\u0633\u062a\u064a\u0646', 'stem': u'\u0628\u0627\u0644\u0645\u062f\u0631\u0633\u062a\u064a\u0646', 'suffix': u''}]\n >>> \n\nThe *set\\_prefix\\_list* and *set\\_suffix\\_list* methods rebuild\nthe finite-state automaton to take the new affix lists into account.\n\nPackage Documentation\n=====================\n\nFiles\n=====\n\n- file/directory category description\n\n- [docs] docs/ docs documentation\n\n- [support]\n\n - pyarabic : basic Arabic library\n\n- [test]\n\n - output/ test test output\n - samples/ test sample files\n - tools/ test script to use tashaphyne\n\nFeatured Posts\n--------------\n\nIf you cite it in academic work, please use this citation:\n\n::\n\n T. 
Zerrouki\u200f, Tashaphyne, Arabic light stemmer\u200f, https://pypi.python.org/pypi/Tashaphyne/0.2\n\nor in bibtex format\n\n\n\n\n", "description_content_type": "", "docs_url": "https://pythonhosted.org/Tashaphyne/", "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://github.com/linuxscout/tashaphyne/", "keywords": "", "license": "GPL", "maintainer": "", "maintainer_email": "", "name": "Tashaphyne", "package_url": "https://pypi.org/project/Tashaphyne/", "platform": "", "project_url": "https://pypi.org/project/Tashaphyne/", "project_urls": { "Homepage": "http://github.com/linuxscout/tashaphyne/" }, "release_url": "https://pypi.org/project/Tashaphyne/0.3.4/", "requires_dist": [ "pyarabic" ], "requires_python": "", "summary": "Tashaphyne Arabic Light Stemmer", "version": "0.3.4" }, "last_serial": 5764198, "releases": { "0.1": [ { "comment_text": "", "digests": { "md5": "8cac5c5ab84327313d1697aa002d2b36", "sha256": "f54163d413c67bd862cbf83ec2516f85182f5f058d7b561e6e4ea97635a7c644" }, "downloads": -1, "filename": "Tashaphyne-0.1.win32.exe", "has_sig": false, "md5_digest": "8cac5c5ab84327313d1697aa002d2b36", "packagetype": "bdist_wininst", "python_version": "2.6", "requires_python": null, "size": 178388, "upload_time": "2010-02-28T22:33:34", "url": "https://files.pythonhosted.org/packages/97/1d/58e34f01b9697a8fb191f7f2d381850ab7bdd1b3bc26bc55d1f91aa1033e/Tashaphyne-0.1.win32.exe" }, { "comment_text": "", "digests": { "md5": "29cde2cc10c9f5e9ed1eafebda3c78e2", "sha256": "50d425af56b7b3c377b9c47af1d6df688e0ce2ffa9bb1a34c028832d73687c86" }, "downloads": -1, "filename": "Tashaphyne-0.1.zip", "has_sig": false, "md5_digest": "29cde2cc10c9f5e9ed1eafebda3c78e2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11058, "upload_time": "2010-02-28T22:39:19", "url": 
"https://files.pythonhosted.org/packages/a8/8a/7a730e631423add92342b687e171b42281512aca3299ef5820777d0c28fc/Tashaphyne-0.1.zip" } ], "0.2": [ { "comment_text": "", "digests": { "md5": "5f3c3a1b2068418095b572edec04c47b", "sha256": "78ad363770ed97674b31f0a5e8837ef4e6e4b1e59d62556572c5ff16834ead88" }, "downloads": -1, "filename": "Tashaphyne-0.2.win32.exe", "has_sig": false, "md5_digest": "5f3c3a1b2068418095b572edec04c47b", "packagetype": "bdist_wininst", "python_version": "any", "requires_python": null, "size": 316631, "upload_time": "2012-03-28T18:31:29", "url": "https://files.pythonhosted.org/packages/71/cf/dc6225a195ed111c9ff4dc9644f41ecda2079b7139879df70fe540a08e97/Tashaphyne-0.2.win32.exe" }, { "comment_text": "", "digests": { "md5": "07ae239fb8aef7d4af2f347d4b81ab21", "sha256": "cf9f6fae718e527c3242f23a8b15db068d262a85f9c27b69ff034ac4c0b6832c" }, "downloads": -1, "filename": "Tashaphyne-0.2.zip", "has_sig": false, "md5_digest": "07ae239fb8aef7d4af2f347d4b81ab21", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10386, "upload_time": "2012-03-28T18:31:47", "url": "https://files.pythonhosted.org/packages/80/23/a49da799223980cb8afdab1645cb194703175c27a2df798f4e4035d5f9c3/Tashaphyne-0.2.zip" }, { "comment_text": "", "digests": { "md5": "9dba7d87643e300d60efff781df5f036", "sha256": "f4fb22682bc1a4ff64027bbcaacfad0ecadfa602673ee7c760734a55e761f677" }, "downloads": -1, "filename": "tashaphyne-python_0.2-1_all.deb", "has_sig": false, "md5_digest": "9dba7d87643e300d60efff781df5f036", "packagetype": "bdist_dumb", "python_version": "2.7", "requires_python": null, "size": 9560, "upload_time": "2013-02-10T13:46:59", "url": "https://files.pythonhosted.org/packages/07/7d/828f0b9a1b13d8b4f40807228e204939aaa40347db6fde2eb2831e8b112a/tashaphyne-python_0.2-1_all.deb" }, { "comment_text": "", "digests": { "md5": "f2d8cbef5f433cdc58c03fc2234f0c58", "sha256": "7808d2e53a61f2fe4f2ce656c70f0ac5d34c232b7684fad3a621f7dd981fcc4f" }, "downloads": -1, 
"filename": "tashaphyne-python-0.2-1.noarch.rpm", "has_sig": false, "md5_digest": "f2d8cbef5f433cdc58c03fc2234f0c58", "packagetype": "bdist_rpm", "python_version": "2.7", "requires_python": null, "size": 11836, "upload_time": "2013-02-10T13:46:19", "url": "https://files.pythonhosted.org/packages/db/8f/73d943adc9d0bf63f6d03cbb339231aa662f94e001068788140bcf892cc9/tashaphyne-python-0.2-1.noarch.rpm" } ], "0.3": [ { "comment_text": "", "digests": { "md5": "bf2458ca1435219d03e3bf08f9b88a6d", "sha256": "60d9f85a942155b24d3e3f937c1f7beeb44f423c0a650600bef786f60540a0d3" }, "downloads": -1, "filename": "Tashaphyne-0.3-py2-none-any.whl", "has_sig": false, "md5_digest": "bf2458ca1435219d03e3bf08f9b88a6d", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 19826, "upload_time": "2017-02-15T20:08:52", "url": "https://files.pythonhosted.org/packages/f9/96/c7090d79579415a8a33324b1da5455558f784b74791df5034f5a62b18363/Tashaphyne-0.3-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "8fc5d5a07d9c6813b51da1fd46027f44", "sha256": "009fd233ea77d9ffc0746d078c17f04fdb128bf09c24bf2d7891d66adf3b99b4" }, "downloads": -1, "filename": "Tashaphyne-0.3.tar.gz", "has_sig": false, "md5_digest": "8fc5d5a07d9c6813b51da1fd46027f44", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 17759, "upload_time": "2017-02-15T20:09:01", "url": "https://files.pythonhosted.org/packages/15/09/0d69125693136ad91691dd9d2a10c1422175a6537a2b97abee441dc39e87/Tashaphyne-0.3.tar.gz" } ], "0.3.1": [ { "comment_text": "", "digests": { "md5": "fc3e3ff013d891958193a5544fd464a4", "sha256": "c1fc09f58355e9247fef825fd55292004de3bfd084b383771a7b14c22178be79" }, "downloads": -1, "filename": "Tashaphyne-0.3.1-py2-none-any.whl", "has_sig": false, "md5_digest": "fc3e3ff013d891958193a5544fd464a4", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 19918, "upload_time": "2017-02-16T11:47:03", "url": 
"https://files.pythonhosted.org/packages/15/3e/6d7b9559a75e2d9436349e0bb0ab1b684c56086e3961dee851fe7fcb4b0c/Tashaphyne-0.3.1-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3833876ac5d28c1d3d038fcd1b1cdd4f", "sha256": "6cfd6d90fa91805fcc260a11dbcca310abac92b513906a931a722caa3f0f816a" }, "downloads": -1, "filename": "Tashaphyne-0.3.1-py3-none-any.whl", "has_sig": false, "md5_digest": "3833876ac5d28c1d3d038fcd1b1cdd4f", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 19918, "upload_time": "2017-02-16T11:47:06", "url": "https://files.pythonhosted.org/packages/b4/43/cd0cdd4069744be422fd6ddffe13adc50dbd2c18a6a260112f517c6959eb/Tashaphyne-0.3.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "282441542c393dfbc434eaf538f721e1", "sha256": "b2a98a03cabd12b7641bb3dee5d3430e57809b1c67260cce1d1c39887b3f2f00" }, "downloads": -1, "filename": "Tashaphyne-0.3.1.tar.gz", "has_sig": false, "md5_digest": "282441542c393dfbc434eaf538f721e1", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 17855, "upload_time": "2017-02-16T11:47:08", "url": "https://files.pythonhosted.org/packages/76/e5/398368a6d385a8c3a27cc024af8a6dc0edb83b8e30ebd560f6f9db65ab39/Tashaphyne-0.3.1.tar.gz" } ], "0.3.2": [ { "comment_text": "", "digests": { "md5": "8120d897b7783b4933425b0dac866438", "sha256": "c2739d9c0cabcd676567e0573b5cbc112f955f517891096149409300c710e5be" }, "downloads": -1, "filename": "Tashaphyne-0.3.2-py2-none-any.whl", "has_sig": false, "md5_digest": "8120d897b7783b4933425b0dac866438", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 20191, "upload_time": "2018-04-27T21:23:41", "url": "https://files.pythonhosted.org/packages/ba/64/c34a5bd3a44c97e7abd710a129d304cb4889949ae9137b5b5f060e41dcc4/Tashaphyne-0.3.2-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "d75b0afb6451a5bd293ca75e042aa156", "sha256": 
"5d3f6c32aac3eb9d3a65a2f78ab5251cbd94b451aabf13d7fc268210e14260f4" }, "downloads": -1, "filename": "Tashaphyne-0.3.2-py3-none-any.whl", "has_sig": false, "md5_digest": "d75b0afb6451a5bd293ca75e042aa156", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 20192, "upload_time": "2018-04-27T21:23:58", "url": "https://files.pythonhosted.org/packages/77/7d/b431adae1b272bd67d79bca8d749fe93ce3cd40b6009b4c4697cd5006c88/Tashaphyne-0.3.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "6aad9a0491f5f10e12e4664d158c4406", "sha256": "6b9fe98281417a80956bd41b209cd7dac7853e5852260e41beefb308d5401f03" }, "downloads": -1, "filename": "Tashaphyne-0.3.2.tar.gz", "has_sig": false, "md5_digest": "6aad9a0491f5f10e12e4664d158c4406", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 17972, "upload_time": "2018-04-27T21:24:28", "url": "https://files.pythonhosted.org/packages/01/93/3e48cfbbbabc7daa98151475b105582a51c5e952fdd81d47bc18ee8e758f/Tashaphyne-0.3.2.tar.gz" } ], "0.3.3.2": [ { "comment_text": "", "digests": { "md5": "69f9e32496b83d1e5c3467674a400fdd", "sha256": "471c2dba24a2a682a028d5463eded1f8727900159ef66c3fc0e965faf1c346aa" }, "downloads": -1, "filename": "Tashaphyne-0.3.3.2-py2-none-any.whl", "has_sig": false, "md5_digest": "69f9e32496b83d1e5c3467674a400fdd", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 38366, "upload_time": "2019-08-23T11:51:39", "url": "https://files.pythonhosted.org/packages/00/54/501125556288a416b525ba30440c0c313f96c5b591eec23dc5b24e14c1c1/Tashaphyne-0.3.3.2-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "50f8bca3113f8635b064a4acedb8f8cc", "sha256": "04aea1ffa1c6caf1a4a833a55d0f9db713c66a7b3960c199a587bf08d10aca40" }, "downloads": -1, "filename": "Tashaphyne-0.3.3.2-py3-none-any.whl", "has_sig": false, "md5_digest": "50f8bca3113f8635b064a4acedb8f8cc", "packagetype": "bdist_wheel", "python_version": "py3", 
"requires_python": null, "size": 43422, "upload_time": "2019-08-23T11:51:27", "url": "https://files.pythonhosted.org/packages/a8/73/072eaa7221e6667552cfd04f4063be245f57da479619311c9966acd131d8/Tashaphyne-0.3.3.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "e51b5f793bb6362b8f122b6c5fdbfe1c", "sha256": "93ee8b066c38b703b15e17a19f3c5ed94789d1a2f79463dec0bcfa918ccb31f6" }, "downloads": -1, "filename": "Tashaphyne-0.3.3.2.tar.gz", "has_sig": false, "md5_digest": "e51b5f793bb6362b8f122b6c5fdbfe1c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 42613, "upload_time": "2019-08-23T11:52:01", "url": "https://files.pythonhosted.org/packages/3b/b3/e9ae4c7a21e0f53e65ea9c0cd3e583a72eb1c46bd8d18f26e07eaabe5640/Tashaphyne-0.3.3.2.tar.gz" } ], "0.3.4": [ { "comment_text": "", "digests": { "md5": "105d29bb97395c4e782540763460bd90", "sha256": "0f06e72f66226d59dbc6516bfbefb785a5a33b734593c7d5679ebd0e5419ea17" }, "downloads": -1, "filename": "Tashaphyne-0.3.4-py2-none-any.whl", "has_sig": false, "md5_digest": "105d29bb97395c4e782540763460bd90", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 265729, "upload_time": "2019-08-31T11:40:33", "url": "https://files.pythonhosted.org/packages/e1/23/7e659d7ffcebd822140655a90a4317973e2aed8da20c05cccdfd164cc60a/Tashaphyne-0.3.4-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1925e2aa16de53bbab8dcf3f925ab897", "sha256": "39d9236de0a505419a83d4db61df9b51384d62036629ef4cbf414114b79b44bf" }, "downloads": -1, "filename": "Tashaphyne-0.3.4-py3-none-any.whl", "has_sig": false, "md5_digest": "1925e2aa16de53bbab8dcf3f925ab897", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 244193, "upload_time": "2019-08-31T11:40:46", "url": "https://files.pythonhosted.org/packages/e8/8b/bca9d84a0c381da44791c03e0f4d7b12127df03311c46e5ddda0d9f5674f/Tashaphyne-0.3.4-py3-none-any.whl" }, { "comment_text": "", "digests": { 
"md5": "e2adc81db0d83cfb784e2eca773b6c0a", "sha256": "4aa839b15bd6dd5465268a610366c9d9b4572f719e591a9e0ef50ae6d00c6778" }, "downloads": -1, "filename": "Tashaphyne-0.3.4.tar.gz", "has_sig": false, "md5_digest": "e2adc81db0d83cfb784e2eca773b6c0a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 241677, "upload_time": "2019-08-31T11:41:00", "url": "https://files.pythonhosted.org/packages/ef/77/4f307adefa72a60e10d7baf9c78667dd8924c85c02ef7bad0bc6ac8f070d/Tashaphyne-0.3.4.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "105d29bb97395c4e782540763460bd90", "sha256": "0f06e72f66226d59dbc6516bfbefb785a5a33b734593c7d5679ebd0e5419ea17" }, "downloads": -1, "filename": "Tashaphyne-0.3.4-py2-none-any.whl", "has_sig": false, "md5_digest": "105d29bb97395c4e782540763460bd90", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 265729, "upload_time": "2019-08-31T11:40:33", "url": "https://files.pythonhosted.org/packages/e1/23/7e659d7ffcebd822140655a90a4317973e2aed8da20c05cccdfd164cc60a/Tashaphyne-0.3.4-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1925e2aa16de53bbab8dcf3f925ab897", "sha256": "39d9236de0a505419a83d4db61df9b51384d62036629ef4cbf414114b79b44bf" }, "downloads": -1, "filename": "Tashaphyne-0.3.4-py3-none-any.whl", "has_sig": false, "md5_digest": "1925e2aa16de53bbab8dcf3f925ab897", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 244193, "upload_time": "2019-08-31T11:40:46", "url": "https://files.pythonhosted.org/packages/e8/8b/bca9d84a0c381da44791c03e0f4d7b12127df03311c46e5ddda0d9f5674f/Tashaphyne-0.3.4-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "e2adc81db0d83cfb784e2eca773b6c0a", "sha256": "4aa839b15bd6dd5465268a610366c9d9b4572f719e591a9e0ef50ae6d00c6778" }, "downloads": -1, "filename": "Tashaphyne-0.3.4.tar.gz", "has_sig": false, "md5_digest": "e2adc81db0d83cfb784e2eca773b6c0a", "packagetype": "sdist", 
"python_version": "source", "requires_python": null, "size": 241677, "upload_time": "2019-08-31T11:41:00", "url": "https://files.pythonhosted.org/packages/ef/77/4f307adefa72a60e10d7baf9c78667dd8924c85c02ef7bad0bc6ac8f070d/Tashaphyne-0.3.4.tar.gz" } ] }