{ "info": { "author": "Jan Laukemann", "author_email": "jan.laukemann@fau.de", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Developers", "Intended Audience :: Science/Research", "License :: OSI Approved :: GNU Affero General Public License v3", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.5", "Topic :: Scientific/Engineering", "Topic :: Software Development", "Topic :: Utilities" ], "description": ".. image:: doc/osaca-logo.png\n :alt: OSACA logo\n :width: 80%\n\nOSACA\n=====\n\nOpen Source Architecture Code Analyzer\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nThis tool allows automatic instruction fetching of assembly code,\nautomatic generation of test cases for assembly instructions that create latency\nand throughput benchmarks for a specific instruction form, and throughput\nanalysis and prediction for an innermost loop kernel.\n\n.. image:: https://travis-ci.com/RRZE-HPC/OSACA.svg?token=393L6z2HEXNiGLtZ43s6&branch=master\n :target: https://travis-ci.com/RRZE-HPC/OSACA\n\n.. image:: https://landscape.io/github/RRZE-HPC/OSACA/master/landscape.svg?style=flat&badge_auth_token=c95f01b247f94bc79c09d21c5c827697\n :target: https://landscape.io/github/RRZE-HPC/OSACA/master\n :alt: Code Health\n\nGetting started\n===============\n\nInstallation\n~~~~~~~~~~~~\nOn most systems with Python pip and setuptools installed, just run:\n\n.. code:: bash\n\n pip install --user osaca\n\nfor the latest release.\n\nTo build OSACA from source, clone this repository using ``git clone https://github.com/RRZE-HPC/OSACA`` and run the following in the root directory:\n\n.. 
code:: bash\n\n python ./setup.py install\n\nAfter installation, OSACA can be started with the command ``osaca`` in the CLI.\n\nDependencies:\n~~~~~~~~~~~~~~~\nAdditional requirements are:\n\n- `Python3 `_\n- `pandas `_\n- `NumPy `_\n- `Kerncraft `_ for marker insertion\n- `ibench `_ for throughput/latency measurements\n\nDesign\n======\nA schematic design of OSACA's workflow is shown below:\n\n.. image:: doc/osaca-workflow.png\n :alt: OSACA workflow\n :width: 80%\n\nUsage\n=====\n\nThe usage of OSACA can be listed as:\n\n.. code:: bash\n\n osaca [-h] [-V] [--arch ARCH] [--tp-list] [-i | --iaca | -m] FILEPATH\n\n- ``-h`` or ``--help`` prints out the help message.\n- ``-V`` or ``--version`` shows the program\u2019s version number.\n- ``ARCH`` needs to be replaced with the desired architecture abbreviation. This flag is necessary for the throughput analysis (default function) and the inclusion of an ibench output (``-i``). Possible options are ``SNB``, ``IVB``, ``HSW``, ``BDW`` and ``SKL`` for the latest Intel micro architectures starting from Intel Sandy Bridge and ``ZEN`` for the AMD Zen (17h family) architecture.\n- While in the throughput analysis mode, one can add ``--tp-list`` to print the additional throughput list of the kernel or ``--iaca`` to tell OSACA to search for IACA binary markers.\n- ``-i`` or ``--include-ibench`` starts the integration of ibench output into the CSV data file determined by ``ARCH``.\n- With the flag ``-m`` or ``--insert-marker`` OSACA calls the Kerncraft module for the interactive insertion of `IACA `_ markers into suggested assembly blocks.\n- ``FILEPATH`` describes the filepath to the file to work with and is always necessary.\n\nOSACA's range of functions is described below.\n\nThroughput analysis\n~~~~~~~~~~~~~~~~~~~\nAs the main functionality of OSACA, this process starts by default. 
It is always necessary to specify the core architecture by the flag ``--arch ARCH``, where ``ARCH`` can stand for ``SNB``, ``IVB``, ``HSW``, ``BDW``, ``SKL`` or ``ZEN``.\n\nFor extracting the right kernel, one has to mark it beforehand. For this there are two different approaches:\n\n| **High level code**\n\nThe OSACA marker is ``//STARTLOOP`` and must be put in one line in front of the loop head, and the loop code must be indented consistently. This means the marker and the head must have the same indentation level while the whole loop body needs to be more indented than the code before and after. For instance, this is a valid OSACA marker:\n\n.. code-block:: c\n\n int i = 0;\n //STARTLOOP\n while(i < N){\n // do something...\n i++;\n }\n\n| **Assembly code**\n\nAnother way of marking a kernel is to insert the IACA byte markers into the assembly file before and after the loop.\nFor this, the start marker has to be inserted right in front of the loop label and the end marker directly after the jump instruction.\nThe start and end markers can be seen in the example below:\n\n.. code-block:: gas\n\n movl $111,%ebx ;IACA START MARKER\n .byte 100,103,144 ;IACA START MARKER\n ; LABEL\n ; do something\n ; ...\n ; conditional jump to LABEL\n movl $222,%ebx ;IACA END MARKER\n .byte 100,103,144 ;IACA END MARKER\n\nThe optional flag ``--iaca`` determines whether OSACA searches for the IACA byte markers or the OSACA marker in the chosen file.\n\nWith an additional, optional ``--tp-list``, OSACA adds a simple list of all kernel instruction forms together with their reciprocal throughput to the output. 
This is helpful when no further information about the port binding of the individual instruction forms is available.\n\nInclude new measurements into the data file\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\nWhen run with the flag ``-i`` or ``--include-ibench`` and a specified micro architecture ``ARCH``, OSACA\ntakes the values given in an ibench output file and checks them for plausibility. If a value is not in the data file already, it will be added, otherwise OSACA prints out a warning message and keeps the old value in the data file. If a value does not pass the validation, a warning message is shown; however, OSACA will keep working with the new value.\nThe handling of ibench is briefly described in the example section below.\n\nInsert IACA markers\n~~~~~~~~~~~~~~~~~~~\nUsing the ``-m`` or ``--insert-marker`` flags for a given file, OSACA calls the implemented Kerncraft module for identifying and marking the inner-loop block in *manual mode*. More information about how this is done can be found in the `Kerncraft repository `_.\n\nExample\n=======\nTo illustrate the functionality of OSACA, a sample kernel is analyzed for an Intel IVB core below:\n\n.. code-block:: c\n\n double a[N], b[N];\n double s;\n\n //STARTLOOP\n for(int i = 0; i < N; ++i)\n a[i] = s * b[i];\n\nThe code shows a simple scalar multiplication of a vector ``b`` and a floating-point number ``s``. The result is\nwritten to vector ``a``.\nAfter including the OSACA marker ``//STARTLOOP`` and compiling the source, one can\nstart the analysis by typing\n\n.. code:: bash\n\n osaca --arch IVB PATH/TO/FILE\n\nin the command line. Optionally, one can generate the assembly code from the file, identify and mark the kernel of interest and run OSACA with the additional ``--iaca`` flag.\n\nThe output is:\n\n.. 
code-block::\n\n Throughput Analysis Report\n --------------------------\n X - No information for this instruction in database\n * - Instruction micro-ops not bound to a port\n\n Port Binding in Cycles Per Iteration:\n -------------------------------------------------\n | Port | 0 | 1 | 2 | 3 | 4 | 5 |\n -------------------------------------------------\n | Cycles | 2.33 | 1.33 | 5.0 | 5.0 | 2.0 | 1.33 |\n -------------------------------------------------\n\n Ports Pressure in cycles \n | 0 | 1 | 2 | 3 | 4 | 5 |\n -------------------------------------------\n | | | 0.50 | 0.50 | 1.00 | | movl $0x0,-0x24(%rbp)\n | | | | | | | jmp 10b \n | | | 0.50 | 0.50 | | | mov -0x48(%rbp),%rax\n | | | 0.50 | 0.50 | | | mov -0x24(%rbp),%edx\n | 0.33 | 0.33 | | | | 0.33 | movslq %edx,%rdx\n | | | 0.50 | 0.50 | | | vmovsd (%rax,%rdx,8),%xmm0\n | 1.00 | | 0.50 | 0.50 | | | vmulsd -0x50(%rbp),%xmm0,%xmm0\n | | | 0.50 | 0.50 | | | mov -0x38(%rbp),%rax\n | | | 0.50 | 0.50 | | | mov -0x24(%rbp),%edx\n | 0.33 | 0.33 | | | | 0.33 | movslq %edx,%rdx\n | | | 0.50 | 0.50 | 1.00 | | vmovsd %xmm0,(%rax,%rdx,8)\n | | | | | | | X addl $0x1,-0x24(%rbp)\n | | | 0.50 | 0.50 | | | mov -0x24(%rbp),%eax\n | 0.33 | 0.33 | 0.50 | 0.50 | | 0.33 | cmp -0x54(%rbp),%eax\n | | | | | | | jl e4 \n | 0.33 | 0.33 | | | | 0.33 | mov %rcx,%rsp\n Total number of estimated throughput: 5.0\n\nIt shows the whole kernel together with the average port pressure of each instruction form and the overall port binding.\nIn the fifth-to-last line containing ``addl $0x1, -0x24(%rbp)`` one can see an ``X`` in front of the instruction form and no port occupation.\nThis means either there are no measured values for this instruction form or no port binding is provided in the\ndata file.\nIn the first case, OSACA automatically creates two benchmark assembly files (``add-mem_imd.S`` for latency and ``add-mem_imd-TP.S`` for throughput) in the benchmark folder, if they do not already exist there.\n\nOne can now run ibench to get the 
throughput value for addl with the given file. Note that the assembly\nfile used for ibench is written in Intel syntax, so for a valid run the instruction ``addl`` must be\nchanged to ``add`` manually.\n\nFor measuring the instruction forms with ibench we highly recommend using an exclusively allocated node,\nso that no other workload distorts the results. For ibench to work correctly, the benchmark files\nfrom OSACA need to be placed in a subdirectory of src in root so that ibench can create a folder with the\nsubdirectory\u2019s name and the shared objects. For running the tests the frequencies of all cores must be set to a\nconstant value and this has to be given as an argument together with the directory of the shared objects to\nibench, e.g.:\n\n.. code:: bash\n\n ./ibench ./AVX 2.2\n\nfor running ibench in the directory ``AVX`` with a core frequency of 2.2 GHz.\nWe get an output like:\n\n.. code:: bash\n\n Using frequency 2.20GHz.\n add-mem_imd-TP: 1.023 (clock cycles) [DEBUG - result: 1.000000]\n add-mem_imd: 6.050 (clock cycles) [DEBUG - result: 1.000000]\n\nThe debug output, the resulting value of register ``xmm0``, is additional validation information for the user that depends on\nthe executed instruction form; it is not considered by OSACA.\nThe ibench output information can be included by running OSACA with the flag ``--include-ibench`` or just\n``-i`` and the specified micro architecture:\n\n.. code-block:: bash\n\n osaca --arch IVB -i PATH/TO/IBENCH-OUTPUTFILE\n\nFor now no automatic allocation of ports for an instruction form is implemented, so for getting an output in the Ports Pressure table, one must add the port occupation by hand.\nWe know that the inserted instruction form must always be assigned to ports 2, 3 and 4 and additionally to either port 0, 1 or 5; a valid data file entry would therefore look like this:\n\n.. 
code:: bash\n\n addl-mem_imd,1.0,6.0,\"(0.33,0.33,1.00,1.00,1.00,0.33)\"\n\nAnother throughput analysis with OSACA now returns all information for the kernel:\n\n.. code-block::\n\n Throughput Analysis Report\n --------------------------\n X - No information for this instruction in database\n * - Instruction micro-ops not bound to a port\n\n Port Binding in Cycles Per Iteration:\n -------------------------------------------------\n | Port | 0 | 1 | 2 | 3 | 4 | 5 |\n -------------------------------------------------\n | Cycles | 2.67 | 1.67 | 6.0 | 6.0 | 3.0 | 1.67 |\n -------------------------------------------------\n\n Ports Pressure in cycles \n | 0 | 1 | 2 | 3 | 4 | 5 |\n -------------------------------------------\n | | | 0.50 | 0.50 | 1.00 | | movl $0x0,-0x24(%rbp)\n | | | | | | | jmp 10b \n | | | 0.50 | 0.50 | | | mov -0x48(%rbp),%rax\n | | | 0.50 | 0.50 | | | mov -0x24(%rbp),%edx\n | 0.33 | 0.33 | | | | 0.33 | movslq %edx,%rdx\n | | | 0.50 | 0.50 | | | vmovsd (%rax,%rdx,8),%xmm0\n | 1.00 | | 0.50 | 0.50 | | | vmulsd -0x50(%rbp),%xmm0,%xmm0\n | | | 0.50 | 0.50 | | | mov -0x38(%rbp),%rax\n | | | 0.50 | 0.50 | | | mov -0x24(%rbp),%edx\n | 0.33 | 0.33 | | | | 0.33 | movslq %edx,%rdx\n | | | 0.50 | 0.50 | 1.00 | | vmovsd %xmm0,(%rax,%rdx,8)\n | 0.33 | 0.33 | 1.00 | 1.00 | 1.00 | 0.33 | addl $0x1,-0x24(%rbp)\n | | | 0.50 | 0.50 | | | mov -0x24(%rbp),%eax\n | 0.33 | 0.33 | 0.50 | 0.50 | | 0.33 | cmp -0x54(%rbp),%eax\n | | | | | | | jl e4 \n | 0.33 | 0.33 | | | | 0.33 | mov %rcx,%rsp\n Total number of estimated throughput: 6.0\n\nCredits\n=======\nImplementation: Jan Laukemann\n\nLicense\n=======\n`AGPL-3.0 `_\n\n\n", "description_content_type": "text/x-rst", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/RRZE-HPC/OSACA", "keywords": "hpc performance benchmark analysis architecture", "license": "AGPLv3", "maintainer": "", "maintainer_email": "", "name": "osaca", 
"package_url": "https://pypi.org/project/osaca/", "platform": "", "project_url": "https://pypi.org/project/osaca/", "project_urls": { "Homepage": "https://github.com/RRZE-HPC/OSACA" }, "release_url": "https://pypi.org/project/osaca/0.2.2/", "requires_dist": [ "numpy", "pandas", "kerncraft" ], "requires_python": ">=3.5", "summary": "Open Source Architecture Code Analyzer", "version": "0.2.2" }, "last_serial": 5982084, "releases": { "0.1": [ { "comment_text": "", "digests": { "md5": "bfc2e081b1baea09e277fe48ca435393", "sha256": "af25ed1559fbcabdca25258306535fc20a61d9294ed3f6ef04330a037768a66c" }, "downloads": -1, "filename": "osaca-0.1.tar.gz", "has_sig": false, "md5_digest": "bfc2e081b1baea09e277fe48ca435393", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2648980, "upload_time": "2018-01-24T08:17:47", "url": "https://files.pythonhosted.org/packages/f0/99/d8f31ae8f2d0f79257d6cfc036ded8d484c27b5534258b4232e0193e949e/osaca-0.1.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "3a32b2a9d4fd55f0f7348036f531879f", "sha256": "751fc7e8c4f974854dd928e104f3127fc56c13ecc5cfd1bd86fafaf805fd54a4" }, "downloads": -1, "filename": "osaca-0.2.0-py3-none-any.whl", "has_sig": false, "md5_digest": "3a32b2a9d4fd55f0f7348036f531879f", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 45829, "upload_time": "2018-09-03T19:51:05", "url": "https://files.pythonhosted.org/packages/88/a5/2a81436fc6321f057e6d6a70369d0847d6c6ba48b9d70cbc16920846b288/osaca-0.2.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1f7d4929e6ace32bde066bbb204ceedb", "sha256": "de21c016a48629d59192170cfbd29ccf7353d14be4c7a0b41ef2843ab2fcb55f" }, "downloads": -1, "filename": "osaca-0.2.0.tar.gz", "has_sig": false, "md5_digest": "1f7d4929e6ace32bde066bbb204ceedb", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 76358, "upload_time": "2018-09-03T19:51:06", "url": 
"https://files.pythonhosted.org/packages/1d/22/a94998e6069447b386f9bd32a6dad3553a66279c50349feb089b5e9c9a37/osaca-0.2.0.tar.gz" } ], "0.2.1": [ { "comment_text": "", "digests": { "md5": "d5563ecbd042c5c3f1c6ad68e56deadb", "sha256": "c8569d2d9446860b3cc29674daaa9382117f304b146023861fc5dc650db473ce" }, "downloads": -1, "filename": "osaca-0.2.1.tar.gz", "has_sig": false, "md5_digest": "d5563ecbd042c5c3f1c6ad68e56deadb", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2649646, "upload_time": "2019-01-10T12:55:55", "url": "https://files.pythonhosted.org/packages/a3/b1/fbf0b923352e53214751e33cabc1cbd7a67aa58751d38dcf7c5569270840/osaca-0.2.1.tar.gz" } ], "0.2.2": [ { "comment_text": "", "digests": { "md5": "ee6388ab8e7ef35abd91f1e419906758", "sha256": "62c7d46e4435d4e00ab0acf79de12b86c1db135fdd003b92a3fd58aaed783384" }, "downloads": -1, "filename": "osaca-0.2.2-py3-none-any.whl", "has_sig": false, "md5_digest": "ee6388ab8e7ef35abd91f1e419906758", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 228589, "upload_time": "2019-05-16T14:50:24", "url": "https://files.pythonhosted.org/packages/e3/0b/b558bef3592baf38d82825dedb6183d81160d9dcf6d58095ff7c0cc46ae8/osaca-0.2.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "05d0da51b55271233b2e7fb4ecd1b0ad", "sha256": "da05a322938e64ab7b8451536a1c4e150f5b03f75ac234fb9607eec1e8472651" }, "downloads": -1, "filename": "osaca-0.2.2.tar.gz", "has_sig": false, "md5_digest": "05d0da51b55271233b2e7fb4ecd1b0ad", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 254235, "upload_time": "2019-05-16T14:50:26", "url": "https://files.pythonhosted.org/packages/81/08/56c5706373727bff12253a0bb267d131c981cf7d82ae4f5659c314f8c476/osaca-0.2.2.tar.gz" } ], "0.3.0.dev0": [ { "comment_text": "", "digests": { "md5": "ab0227e1764b2d6c52a843563f2117f8", "sha256": "cbcee6e3f12a16171575a28891aeeee7735c50ba262067c8043b357d33607ae8" 
}, "downloads": -1, "filename": "osaca-0.3.0.dev0-py3-none-any.whl", "has_sig": false, "md5_digest": "ab0227e1764b2d6c52a843563f2117f8", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 259588, "upload_time": "2019-09-27T16:17:44", "url": "https://files.pythonhosted.org/packages/de/5f/5e665f93249ecb4eb73999ec03722384d7b39afc9e68eb26ac6b30fa8647/osaca-0.3.0.dev0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "95ce980aafb1f6e9201fbfa158ca69d9", "sha256": "5219ebf8f1a95939fba90c283e26d14fa20f7da5d103ab3a602c508659612c74" }, "downloads": -1, "filename": "osaca-0.3.0.dev0.tar.gz", "has_sig": false, "md5_digest": "95ce980aafb1f6e9201fbfa158ca69d9", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 255746, "upload_time": "2019-09-27T16:17:47", "url": "https://files.pythonhosted.org/packages/33/9c/943f13b3077cde82c8c153e7da15a0d330b6f3d5e711f4056ba0b1fe0068/osaca-0.3.0.dev0.tar.gz" } ], "0.3.1.dev0": [ { "comment_text": "", "digests": { "md5": "efe2faf81ae14167e262cb431137455c", "sha256": "392ab407bac62d060932df31cbf23c87ad8f181ffef6a61941d0d0d433236830" }, "downloads": -1, "filename": "osaca-0.3.1.dev0-py3-none-any.whl", "has_sig": false, "md5_digest": "efe2faf81ae14167e262cb431137455c", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 256082, "upload_time": "2019-10-04T00:15:55", "url": "https://files.pythonhosted.org/packages/ba/f3/cd38d9c4321b0c122fe669852ff2d298f0af9aef1eaa8bc0fd7a0454b091/osaca-0.3.1.dev0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "f1f9253720f417a8c104484d2acd1a61", "sha256": "b084e4c0dfb769441214d4909810561922aeaedc70c9063d09878aeb31ad2b6b" }, "downloads": -1, "filename": "osaca-0.3.1.dev0.tar.gz", "has_sig": false, "md5_digest": "f1f9253720f417a8c104484d2acd1a61", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 62255, "upload_time": 
"2019-10-04T00:15:58", "url": "https://files.pythonhosted.org/packages/46/ea/d137d125bc423727d01f88c3c651bc5a651a85e86a30c596b34cc2ae43e9/osaca-0.3.1.dev0.tar.gz" } ], "0.3.1.dev1": [ { "comment_text": "", "digests": { "md5": "42c1d74ea88066e866f83f116021f611", "sha256": "ee21bf1eafce1094e7b63280d1bb7285f743efe6504122fe2217e4591323b5ec" }, "downloads": -1, "filename": "osaca-0.3.1.dev1.tar.gz", "has_sig": false, "md5_digest": "42c1d74ea88066e866f83f116021f611", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 62710, "upload_time": "2019-10-16T09:00:32", "url": "https://files.pythonhosted.org/packages/27/65/661f5d3885487ddc8c79c03a098a6cb4a440c0dd2b2f287117888acfff15/osaca-0.3.1.dev1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "ee6388ab8e7ef35abd91f1e419906758", "sha256": "62c7d46e4435d4e00ab0acf79de12b86c1db135fdd003b92a3fd58aaed783384" }, "downloads": -1, "filename": "osaca-0.2.2-py3-none-any.whl", "has_sig": false, "md5_digest": "ee6388ab8e7ef35abd91f1e419906758", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 228589, "upload_time": "2019-05-16T14:50:24", "url": "https://files.pythonhosted.org/packages/e3/0b/b558bef3592baf38d82825dedb6183d81160d9dcf6d58095ff7c0cc46ae8/osaca-0.2.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "05d0da51b55271233b2e7fb4ecd1b0ad", "sha256": "da05a322938e64ab7b8451536a1c4e150f5b03f75ac234fb9607eec1e8472651" }, "downloads": -1, "filename": "osaca-0.2.2.tar.gz", "has_sig": false, "md5_digest": "05d0da51b55271233b2e7fb4ecd1b0ad", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 254235, "upload_time": "2019-05-16T14:50:26", "url": "https://files.pythonhosted.org/packages/81/08/56c5706373727bff12253a0bb267d131c981cf7d82ae4f5659c314f8c476/osaca-0.2.2.tar.gz" } ] }