{ "info": { "author": "Rico Sennrich", "author_email": "sennrich@cl.uzh.ch", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "License :: OSI Approved :: GNU General Public License v2 (GPLv2)", "Operating System :: OS Independent", "Programming Language :: Python :: 2.6", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.2", "Programming Language :: Python :: 3.3", "Programming Language :: Python :: 3.4", "Topic :: Scientific/Engineering", "Topic :: Scientific/Engineering :: Information Analysis", "Topic :: Text Processing", "Topic :: Text Processing :: Linguistic" ], "description": "Bleualign\n=========\nAn MT-based sentence alignment tool\n\nCopyright \u24d2 2010\nRico Sennrich \n\nA project of the Computational Linguistics Group at the University of Zurich (http://www.cl.uzh.ch).\n\nProject Homepage: http://github.com/rsennrich/bleualign\n\nThis program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation\n\nGENERAL INFO\n------------\n\nBleualign is a tool to align parallel texts (i.e. a text and its translation) on a sentence level.\nAdditionally to the source and target text, Bleualign requires an automatic translation of at least one of the texts.\nThe alignment is then performed on the basis of the similarity (modified BLEU score) between the source text sentences (translated into the target language) and the target text sentences.\nSee section PUBLICATIONS for more details.\n\nObtaining an automatic translation is up to the user. The only requirement is that the translation must correspond line-by-line to the source text (no line breaks inserted or removed).\n\nREQUIREMENTS\n------------\n\nThe software was developed on Linux using Python 2.6, but should also support newer versions of Python (including 3.X) and other platforms.\nPlease report any issues you encounter to sennrich@cl.uzh.ch\n\n\nUSAGE INSTRUCTIONS\n------------------\n\nThe input and output formats of bleualign are one sentence per line.\nA line which only contains .EOA is considered a hard delimiter (end of article).\nSentence alignment does not cross these delimiters: reliable delimiters improve speed and performance, wrong ones will seriously degrade performance.\n\nGiven the files sourcetext.txt, targettext.txt and sourcetranslation.txt (the latter being sentence-aligned with sourcetext.txt), a sample call is\n\n ./bleualign.py -s sourcetext.txt -t targettext.txt --srctotarget sourcetranslation.txt -o outputfile\n\nIt is also possible to provide several translations and/or translations in the other translation direction.\nbleualign will run once per translation provided, the final output being the intersection of the individual runs (i.e. sentence pairs produced in each individual run).\n\n ./bleualign.py -s sourcetext.txt -t targettext.txt --srctotarget sourcetranslation1.txt --srctotarget sourcetranslation2.txt --targettosrc targettranslation1.txt -o outputfile\n\n ./bleualign.py -h will show more usage options\n\nTo facilitate batch processing multiple files, `batch_align.py` can be used.\n\n python batch_align directory source_suffix target_suffix translation_suffix\n\nexample: given the directory `raw_files` with the files `0.de`, `0.fr` and `0.trans` and so on, (`0.trans` being the translation of `0.de` into the target language), then this command will align all files: \n\n python batch_align.py raw_files de fr trans\n\nThis will produce the files `0.de.aligned` and `0.fr.aligned`\n\nUSAGE AS PYTHON MODULE\n----------------------\n\nBleualign works as stand-alone script, but can also be imported as a module other Python projects.\nFor code examples, see the example/ directory. If you want to know all options, you can see Aligner.default_options variable in bleualign/aligner.py.\n\nTo use Bleualign as a Python module, the package needs to be installed (from a local copy) with:\n\n python setup.py install\n\nThe Bleualign package can also be installed directly from Github with:\n\n pip install git+https://github.com/rsennrich/Bleualign.git\n\nPUBLICATIONS\n------------\n\nThe algorithm is described in\n\nRico Sennrich, Martin Volk (2010):\n MT-based Sentence Alignment for OCR-generated Parallel Texts. In: Proceedings of AMTA 2010, Denver, Colorado.\n\nRico Sennrich; Martin Volk (2011): \n Iterative, MT-based sentence alignment of parallel texts. In: NODALIDA 2011, Nordic Conference of Computational Linguistics, Riga.\n\n\nCONTACT\n-------\n\nFor questions and feeback, please contact sennrich@cl.uzh.ch or use the GitHub repository.", "description_content_type": null, "docs_url": null, "download_url": "https://github.com/rsennrich/Bleualign", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/rsennrich/Bleualign", "keywords": "Sentence Alignment,Natural Language Processing,Statistical Machine Translation,BLEU", "license": "", "maintainer": "", "maintainer_email": "", "name": "pypi-bleualign", "package_url": "https://pypi.org/project/pypi-bleualign/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/pypi-bleualign/", "project_urls": { "Download": "https://github.com/rsennrich/Bleualign", "Homepage": "https://github.com/rsennrich/Bleualign" }, "release_url": "https://pypi.org/project/pypi-bleualign/0.1.1.1/", "requires_dist": null, "requires_python": "", "summary": "An MT-based sentence alignment tool", "version": "0.1.1.1" }, "last_serial": 1903626, "releases": { "0.1.1": [ { "comment_text": "", "digests": { "md5": "3ccde3aa980f380a8f42da55b93cf164", "sha256": "b8a03fe86f40f1a705c983c218529642d2bb6fa131feaa5a51084930183784e9" }, "downloads": -1, "filename": "pypi-bleualign-0.1.1.tar.gz", "has_sig": false, "md5_digest": "3ccde3aa980f380a8f42da55b93cf164", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 21409, "upload_time": "2016-01-14T02:17:48", "url": "https://files.pythonhosted.org/packages/e9/d1/ba42874dadd78b81e1ab8dbae5e62e60f601f511e4fda7b7522747653cb7/pypi-bleualign-0.1.1.tar.gz" } ], "0.1.1.1": [ { "comment_text": "", "digests": { "md5": "319bcdc7d6869f7e71293dd52d055e80", "sha256": "f7e9b8696ccd74734dd09d53893ee7cb4480c06c5e8ee1184898355f1b8987d1" }, "downloads": -1, "filename": "pypi-bleualign-0.1.1.1.tar.gz", "has_sig": false, "md5_digest": "319bcdc7d6869f7e71293dd52d055e80", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 21639, "upload_time": "2016-01-14T03:36:18", "url": "https://files.pythonhosted.org/packages/ff/de/d3d868a2e873e7660d77b97af43d95ed34f5edb6b8e8d233e2e3770ceb11/pypi-bleualign-0.1.1.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "319bcdc7d6869f7e71293dd52d055e80", "sha256": "f7e9b8696ccd74734dd09d53893ee7cb4480c06c5e8ee1184898355f1b8987d1" }, "downloads": -1, "filename": "pypi-bleualign-0.1.1.1.tar.gz", "has_sig": false, "md5_digest": "319bcdc7d6869f7e71293dd52d055e80", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 21639, "upload_time": "2016-01-14T03:36:18", "url": "https://files.pythonhosted.org/packages/ff/de/d3d868a2e873e7660d77b97af43d95ed34f5edb6b8e8d233e2e3770ceb11/pypi-bleualign-0.1.1.1.tar.gz" } ] }