{ "info": { "author": "C\u00e9line Buret", "author_email": "buret.celine@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 5 - Production/Stable", "Environment :: Console", "Intended Audience :: Developers", "Intended Audience :: End Users/Desktop", "Intended Audience :: Science/Research", "License :: OSI Approved :: GNU General Public License (GPL)", "Natural Language :: English", "Operating System :: OS Independent", "Programming Language :: Python :: 2.7", "Topic :: Scientific/Engineering" ], "description": "========\r\npylmflib\r\n========\r\n\r\nLatest version: 1.0\r\n\r\nDate: October 21, 2015\r\n\r\nAuthor: C\u00e9line Buret\r\n\r\nMaintainer: S\u00e9verine Guillaume\r\n\r\nDocumentation: http://himalco.huma-num.fr/documentation/index.htm\r\n\r\nHome page: https://github.com/buret/pylmflib\r\n\r\nLicense: gpl-3.0\r\n\r\nPlatform: Unix, Linux, Windows, MAC\r\n\r\nPackage index owner: C\u00e9line Buret\r\n\r\nIntroduction\r\n=============\r\n\r\nWhat is pylmflib?\r\n___________________\r\n\r\nThe Python LMF library is a suite of open-source Python modules for dictionary format conversion. It performs automatic tasks for multi-languages dictionaries, such as conversion between different formats used for dictionaries.\r\n\r\nThe main idea of ``pylmflib`` is to provide a software package which integrates conversion functions from MDF format to several output formats: LaTeX (PDF), docx, HTML, etc.\r\n\r\n``pylmflib`` implements the LMF standard. 
For more details, please see http://www.lexicalmarkupframework.org.\r\n\r\nWhat can be done with pylmflib?\r\n__________________________________\r\n\r\nWith the help of ``pylmflib``, users can:\r\n - convert a dictionary from a regular MDF format issued from Toolbox to a PDF printable document,\r\n - convert a dictionary from a regular MDF format issued from Toolbox to a docx editable document,\r\n - customise markers used in Toolbox to match the LMF internal format,\r\n - keep an archivable format of their dictionary in XML LMF,\r\n - display their dictionary online using an XSL conversion from XML LMF to HTML.\r\n\r\nHow can pylmflib be used?\r\n_____________________________\r\n\r\n``pylmflib`` is a library written in the Python programming language. It can be used directly in the Python interpreter or imported into Python scripts.\r\nFor more information about Python, see http://www.python.org.\r\n\r\nHow to cite pylmflib?\r\n________________________\r\n\r\nIf you are using ``pylmflib`` for non-commercial, scientific projects, please cite the library in its current state along with the version that you used:\r\n\r\nBuret, C\u00e9line (2015): pylmflib. Python Library for Automatic Tasks in Multi-Languages Dictionaries. Version 1.0 (Uploaded on 2015-10-21). 
URL: http://www.pylmflib.org.\r\n\r\nInstallation\r\n=============\r\n\r\nBasic setup\r\n______________\r\n\r\nUse pip to install ``pylmflib`` package from PyPI:\r\n::\r\n\r\n\t$ pip install pylmflib\r\n\r\nUsage\r\n____________\r\n\r\nIn order to use the library, open Python2 in your terminal and import ``pylmflib`` as follows:\r\n::\r\n\r\n\t>>> from pylmflib import *\r\n\r\nDependencies\r\n___________________\r\n\r\nIndispensable third party libraries\r\n++++++++++++++++++++++++++++++++++++++\r\n\r\nHere is the list of the libraries without which ``pylmflib`` won't work.\r\n\r\nRegex: http://pypi.python.org/pypi/regex\r\n\r\nRecommended third party libraries\r\n++++++++++++++++++++++++++++++++++++++++\r\n\r\nHere is a list of the libraries without which ``pylmflib`` core functions will work, but which are anyway used quite frequently in a lot of modules.\r\n\r\nDocx: https://pypi.python.org/pypi/python-docx\r\n\r\nODF: https://pypi.python.org/pypi/odfpy\r\n\r\nSetup for development version\r\n__________________________________\r\n\r\nPrerequisite\r\n+++++++++++++++\r\n\r\nInstall git.\r\n\r\nSetup with git\r\n++++++++++++++++++\r\n\r\nIf you want to regularly work on ``pylmflib``, open a (git) terminal and type in the following:\r\n::\r\n\r\n\t$ git clone https://github.com/buret/pylmflib\r\n\r\nInstructions for a basic installation on Linux and Mac\r\n_______________________________________________________\r\n\r\nPrerequisites on Linux and Mac\r\n+++++++++++++++++++++++++++++++++++++++\r\n\r\nBefore being able to run ``pylmflib``, you will need to follow these steps:\r\n\r\n1. git\r\n::\r\n\r\n\t$ sudo apt-get install git\r\n\t$ git clone https://github.com/buret/pylmflib pylmflib\r\n\r\n2. setuptools\r\n::\r\n\r\n\t$ wget https://bootstrap.pypa.io/ez_setup.py -O - | sudo python\r\n\r\n3. 
python-docx\r\n\r\nDownload ``python-docx-0.8.5.tar.gz`` : https://pypi.python.org/pypi/python-docx\r\n::\r\n\r\n\t$ tar xvzf python-docx-0.8.5.tar.gz\r\n\t$ cd python-docx-0.8.5/\r\n\t$ sudo python setup.py install\r\n\r\n4. xsltproc\r\n::\r\n\r\n\t$ sudo apt-get install xsltproc\r\n\r\n5. xelatex\r\n::\r\n\r\n\t$ sudo apt-get install texlive\r\n\t$ sudo apt-get install texlive-xetex\r\n\r\n6. Charis SIL\r\n\r\nDownload : http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=charissil_download\r\n\r\nInstall : http://scripts.sil.org/cms/scripts/page.php?item_id=DecompressUtil\r\n\r\n7. MingLiU\r\n\r\nDownload : http://www.fontpalace.com/font-download/MingLiU/\r\n\r\n8. ArialUnicodeMS\r\n\r\nDownload : https://code.google.com/p/tuanphamvu/downloads/detail?name=Arial%20Unicode%20MS.rar&can=2&q=\r\n\r\n9. Copy audio files if any.\r\n\r\npylmflib installation on Linux and Mac\r\n+++++++++++++++++++++++++++++++++++++++++++++++++++++\r\n\r\nWe recommend using the stable version of ``pylmflib`` (1.0). Make sure that ``regex`` is installed on your system prior to installing ``pylmflib``. 
In order to install this version, simply download it from https://github.com/buret/pylmflib or https://pypi.python.org/pypi/pylmflib/1.0, unpack the directory, then ``cd`` into it, and type in the prompt:\r\n::\r\n\r\n\t$ python setup.py install\r\n\r\nYou may need sudo rights to run this command.\r\n\r\nAt this stage, you can run the unit tests:\r\n::\r\n\r\n\t$ test/test_all.py\r\n\r\nYou can also run all provided examples:\r\n::\r\n\r\n\t$ examples/Bambara/bambara.py\r\n\t$ examples/japhug/dict_japhug.py\r\n\t$ examples/khaling/dict_khaling.py\r\n\t$ examples/na/dict_na.py\r\n\t$ examples/test/scenario.py\r\n\t$ examples/yuanga/dict_yuanga.py\r\n\r\nInstallation instructions on Windows\r\n________________________________________\r\n\r\nPrerequisites on Windows\r\n++++++++++++++++++++++++++++++++++\r\n\r\nBefore being able to install ``pylmflib-1.0``, you will need to install:\r\n\r\n1. ``pip-7.1.2``\r\n2. ``VCForPython27.msi``\r\n3. ``python-docx-0.8.5``\r\n4. ``lxml-2.0.3``\r\n\r\nIn some cases, you may need to install:\r\n\r\n * ``setuptools-18.4``\r\n * ``ez_setup.py``\r\n * ``get-pip.py``\r\n\r\npylmflib installation on Windows\r\n++++++++++++++++++++++++++++++++++++++++++\r\n\r\nThe current version of ``pylmflib`` for Python2 should basically also run on Windows. In order to install ``pylmflib`` on a Windows machine, I recommend using the Cygwin terminal and installing ``pylmflib`` in the same way as on Linux or Mac machines.\r\n\r\nWorkarounds\r\n___________________\r\n\r\nTo use the library without installing it, i.e. 
without running the setup-command, a simple way to use ``pylmflib`` is to include it in your sys-path just before you call the library:\r\n::\r\n\r\n\t>>> import sys\r\n\t>>> sys.path.append(\"path_to_pylmflib\")\r\n\r\nCode\r\n======\r\n\r\nSource code is available at: https://github.com/buret/pylmflib\r\n\r\n``pylmflib`` has been developed in Python 2.7.5.\r\n\r\nIt is under GPL licence.\r\n\r\nBasic modules\r\n_____________________\r\n\r\nThe library in its current state consists of the following modules:\r\n * common\r\n * config\r\n * core\r\n * input\r\n * morphology\r\n * morphosyntax\r\n * mrd\r\n * output\r\n * resources\r\n * utils\r\n\r\nBasic formats\r\n____________________\r\n\r\nIn the following, we list some of the formats that are frequently used by ``pylmflib``, be it that they are taken as input formats, or that they are produced as output from the classes and methods provided by ``pylmflib``:\r\n\r\n* MDF\r\n* XML LMF\r\n* LaTeX\r\n* docx\r\n\r\nHere is a list of formats that can be used, but need to be further developed, i.e. integration has been done but implementation has to be completed:\r\n\r\n* XML TEI\r\n* HTML\r\n* ODT\r\n\r\nFormats that have to be added to the library in the future:\r\n\r\n* xls / csv\r\n* Elan\r\n* XML ITE\r\n* XML LIFT\r\n* XML LexiquePro\r\n* XML OLIF\r\n* XML Toolbox\r\n\r\nCoding conventions\r\n_________________________\r\n\r\nPlease respect the coding rules used in the library.\r\n\r\nTest\r\n======\r\n\r\nFor tests, I use the ``unittest`` Python library. To run the tests, just enter the main directory and call ``test/test_all.py`` on the command line. Please do not commit any changes unless all tests run without failure or error.\r\n\r\nAll tests are in a directory ``test/`` within the main directory. For each Python source file in the source directory, there is a test file with a prefix ``test_``. 
For example, the tests of the ``core`` module, which has its source in ``pylmflib/core/``, are located in ``test/test_core_xxx.py``. Within the test files, there is a class defined for each class in the original source files, with a prefix ``Test``. For example, there is a class ``TestLexicalEntry`` defined in ``test_core_lexical_entry.py`` as there is a class ``LexicalEntry`` in ``lexical_entry.py``. For each method of a class, the test class has a method with the prefix ``test_``. For example, the method ``create_related_form()`` of the ``LexicalEntry`` class is tested with the method ``test_create_related_form()`` of the test class.\r\n\r\nDocumentation\r\n=============\r\n\r\nIf you contribute to ``pylmflib``, you should document your code.\r\nThe first step for documentation is the documentation within the code.\r\n\r\nCurrently, documentation is created using the following steps:\r\n\r\n- Whenever code is added to ``pylmflib``, the contributors add documentation inline in their code, following the style used in the project.\r\n- Then, they run ``Doxygen`` using the ``Doxyfile`` provided under ``doc/Doxygen``.\r\n- The general website structure is added around the code. You can find its content by browsing the ``doc/Doxygen/html/`` directory.\r\n\r\nExamples\r\n==========\r\n\r\nWorkflow example\r\n_______________________\r\n\r\nThis is an example workflow that illustrates some of the functionalities of ``pylmflib``. We start with a small dataset from the Bambara language.\r\n\r\nGetting started\r\n+++++++++++++++++++++++++++++\r\n\r\nFirst, make sure to have the Python LMF library downloaded, extracted and installed properly. The dataset that will be used is located under ``examples/Bambara``.\r\n\r\nThis folder includes a Python script that runs the whole code from the beginning to the end. 
In order to start the conversion, go under the main directory and run this script:\r\n::\r\n\r\n\t$ python examples/Bambara/bambara.py\r\n\r\nAs a result, the following files will appear in the result directory:\r\n\r\n* ``Bambara.docx``, that shows an example of a Microsoft Word document that you can obtain ;\r\n* ``Bambara.tex``, that you must compile using XeLaTeX to get a PDF printable dictionary ;\r\n* ``Bambara.txt``, which is similar to the input database ``BambaraDemo.db`` in MDF format ;\r\n* ``Bambara.xml``, which is the XML LMF representation of the dictionary.\r\n\r\nYou can also directly run the conversion and XeLaTeX command by running ``bambara.sh`` or ``bambara.bat`` depending on your operating system.\r\n\r\nPython scripts\r\n++++++++++++++++++++++++++++++\r\n\r\n* ``bambara.py``\r\n\r\nIt is the main script, the one which calls ``pylmflib`` functions:\r\n\r\n1. ``read_config``\r\n2. ``read_mdf``\r\n3. ``read_sort_order``\r\n4. ``write_xml_lmf``\r\n5. ``write_tex``\r\n6. ``write_mdf``\r\n7. ``write_doc``\r\n\r\nSo the basic steps are:\r\n\r\n1. to read the configuration defined in ``config.xml`` (see the tutorial chapter below for details) ;\r\n2. to read the MDF file, so in this case the ``BambaraDemo.db`` Toolbox dictionary ;\r\n3. to read the alphabetical order defined in ``sort_order.xml`` (see the tutorial chapter below for details) ;\r\n4. to convert the MDF text format into a structured XML format, based on LMF standard ;\r\n5. to generate an output LaTeX file ;\r\n6. to generate an MDF file, similar to the input one ;\r\n7. 
to generate an output document file.\r\n\r\nIn this script, the user also has access to all methods of ``pylmflib`` objects, which are fully documented at:\r\nhttp://himalco.huma-num.fr/documentation/index.htm\r\n\r\n* ``setting.py``\r\n\r\nTo be able to customise some Python variables, it is possible to write a ``setting.py`` file, in which the user can:\r\n\r\n - define the items to sort: in this case, we choose to sort the ``lx`` MDF marker contents, but it could be any other field ;\r\n - customise the input MDF markers by modifying the ``mdf_lmf`` Python variable ;\r\n - customise output MDF markers by modifying the ``lmf_mdf`` Python variable.\r\n\r\nIt is also possible to customise Python functions. See the other examples below for more advanced use.\r\n\r\n* ``startup.py``\r\n\r\nThis file is needed to define the working path and the path to the library. Normally, you should not have to modify it.\r\n\r\nBasic example\r\n__________________\r\n\r\nA simple example is presented under ``examples/test``. All available output formats are generated:\r\n\r\n * XML LMF\r\n * LaTeX\r\n * MDF\r\n * docx\r\n * ODT\r\n * HTML\r\n * XML TEI\r\n\r\nNote that conversion scripts from XML LMF to HTML, ODT and XML TEI are here as examples to show what is possible. They have to be reworked to generate user-friendly outputs.\r\n\r\nPDF examples\r\n___________________\r\n\r\nIt is possible to fully customise the desired output. There are three examples to generate customised PDF printable dictionaries, located under ``examples/japhug``, ``examples/khaling`` and ``examples/na``.\r\n\r\nIn all cases, the file ``setting.py`` has been heavily modified. The most important function is ``lmf2tex()``, whose role is to organise data information in the LaTeX output file. If the user does not provide this Python function, there is a default function for basic presentation. 
Again, coding details about this function are available at:\r\nhttp://himalco.huma-num.fr/documentation/index.htm\r\n\r\nDocx example\r\n____________________\r\n\r\nIt is also possible to customise a document output. There is an example to generate a customised docx editable dictionary, located under ``examples/yuanga``.\r\n\r\nMoreover, in this case, entries are not classified by alphabetical order, but by semantic domain.\r\n\r\nChapter titles of the output docx document are defined in ``setting.py``, with ``order`` then ``sd_order`` variables.\r\n\r\nMoreover, the authorised part-of-speech values have been considerably extended by modifying the ``ps_partOfSpeech`` Python variable.\r\n\r\nTutorial\r\n==========\r\n\r\nConfiguration files\r\n_______________________\r\n\r\nThis part is an overview of the configuration files you may have to customise.\r\n\r\n* ``config.xml``\r\n\r\nThe root element is named ``Config``. It contains the following elements, which the user has to set.\r\n\r\n\t``Language``: define the vernacular, national, regional and other languages that you have to use in your multi-languages dictionary, by setting the ISO-639-3 code value (usually composed of three letters).\r\n\r\n\t``Font``: define fonts to use for LaTeX output format if needed ; for each defined language, a font has to be defined using LaTeX commands.\r\n\r\n\t``LMF``: define ``GlobalInformation`` and ``Lexicon`` attributes of ``LexicalResource`` (author, version, dictionary description and title, identifier, etc.) ; among these settings, two are very important to define: ``entrySource`` must point to the dictionary MDF input file, and ``localPath`` must point to the folder where your audio files are located if you have any.\r\n\r\n\t``MDF``: here you can define your own part of speech values if you do not use standard ones defined in MDF.\r\n\r\n\t``LaTeX``: not implemented.\r\n\r\n* ``introduction.tex``\r\n\r\nIf the user wants to insert an introduction in the dictionary, this is the file in which to write it. 
It has to use LaTeX commands.\r\n\r\n* ``preamble.tex``\r\n\r\nThis file is used to define all LaTeX packages that will be needed to compile your LaTeX output file. You have to update it if you customise the ``lmf2tex()`` function by using non-basic LaTeX commands.\r\n\r\n* ``sort_order.xml``\r\n\r\nIf you want your dictionary classified by a specific alphabetical order or if you use IPA or special characters, you have to write your own ``sort_order.xml`` file. The format is simple: for each character, you have to define a rank value.\r\n\r\nFor any of the settings defined above, please refer to the examples for the exact syntax.\r\n\r\nLibrary options\r\n______________________\r\n\r\nThe library provides several options. They are all described in the help menu, which you can display by running, for instance:\r\n::\r\n\r\n\t$ python examples/Bambara/bambara.py -h\r\n\r\nCode warnings\r\n____________________\r\n\r\nWhile running your Python script, you may notice that lots of warning messages are generated by the library. Indeed, all values that are not defined in your configuration files or allowed by the MDF or LMF standards are reported, such as part-of-speech and paradigm label values. Note that these warnings do not block script execution. 
The library also reports unresolved cross references and sound files that are not found.\r\n\r\nExecution errors\r\n______________________\r\n\r\nAny error will raise a Python exception, giving some details about the cause.", "description_content_type": null, "docs_url": "https://pythonhosted.org/pylmflib/", "download_url": "https://pypi.python.org/pypi/pylmflib/1.0", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/buret/pylmflib", "keywords": "", "license": "GPL", "maintainer": "S\u00e9verine Guillaume", "maintainer_email": "guillaume@vjf.cnrs.fr", "name": "pylmflib", "package_url": "https://pypi.org/project/pylmflib/", "platform": "ALL", "project_url": "https://pypi.org/project/pylmflib/", "project_urls": { "Download": "https://pypi.python.org/pypi/pylmflib/1.0", "Homepage": "https://github.com/buret/pylmflib" }, "release_url": "https://pypi.org/project/pylmflib/1.0/", "requires_dist": null, "requires_python": null, "summary": "Python LMF library", "version": "1.0" }, "last_serial": 1780086, "releases": { "1.0": [ { "comment_text": "", "digests": { "md5": "1837df7cd19ffd25d10e24811e6ab9ef", "sha256": "8570df4a4357a979aff20197911590aea17388e66e80861781570f09ebf1054f" }, "downloads": -1, "filename": "pylmflib-1.0.tar.gz", "has_sig": false, "md5_digest": "1837df7cd19ffd25d10e24811e6ab9ef", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3954230, "upload_time": "2015-10-21T14:47:57", "url": "https://files.pythonhosted.org/packages/d2/81/5fe25adc3767538611088b2e84b611fc9a2e8e5a481e72dbfb565f1c76d2/pylmflib-1.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "1837df7cd19ffd25d10e24811e6ab9ef", "sha256": "8570df4a4357a979aff20197911590aea17388e66e80861781570f09ebf1054f" }, "downloads": -1, "filename": "pylmflib-1.0.tar.gz", "has_sig": false, "md5_digest": "1837df7cd19ffd25d10e24811e6ab9ef", "packagetype": "sdist", "python_version": "source", "requires_python": 
null, "size": 3954230, "upload_time": "2015-10-21T14:47:57", "url": "https://files.pythonhosted.org/packages/d2/81/5fe25adc3767538611088b2e84b611fc9a2e8e5a481e72dbfb565f1c76d2/pylmflib-1.0.tar.gz" } ] }