{ "info": { "author": "Red Balloon Security", "author_email": "quack-tech@redballoonsecurity.com", "bugtrack_url": null, "classifiers": [], "description": "# Binary Abstraction Layer (BAL)\n\nThe Binary Abstraction Layer (BAL) package is a tiny framework for analyzing and manipulating \nbinary data. \nIts guiding principle is that a tree is a natural representation for binary data. \nFor example, a firmware image may look as follows:\n\n - Zip Data\n - ELF\n - Header\n - Code\n - Data\n - Images\n - Config\n\nIt defines 3 broad categories of operations on the tree: **convert**, **analyze** and **modify**.\n\n - *Converters* handle serializing and deserializing binary data.\n - *Analyzers* handle extracting information from the tree representation.\n - *Modifiers* handle arbitrary modification of the binary.\n\n## Installation\n\nThe BAL package can be installed from PyPI with the following command:\n```\npip install bal\n```\n\nTo install the BAL module from the repository, clone the repo and run:\n\n```\npip install .\n```\n\nTo install the BAL package and generate a local copy of its documentation, run:\n\n```\npip install .[docs]\nmake html-docs\n```\n\nTo install the core BAL module as well as the dependencies for the example, run:\n\n```\npip install .[examples]\n```\n\n## Concepts\n\nEach node in the tree is represented as a `DataObject`. \nA `DataObject` can wrap either an unstructured string of raw binary data or a `DataModel` (or both). \nA `DataModel` is an abstract class defining some sort of structured data. \nThe `DataModel` is created when deserializing raw binary data. \nIt fits the typical definition of a data model.\n\n\nIn addition, the BAL framework defines a few interfaces:\n\n - `bal.context_ioc.AbstractConverter` A converter takes care of unpacking bytes into a `DataModel` \n (i.e. deserializing) and packing its `DataModel` into bytes (i.e. serializing). 
Its method \n signatures are fixed so that they may be called directly by the `DataObject`.\n - `bal.context_ioc.AbstractModifier` A modifier updates the content of any node within the tree.\n It may modify the packed or unpacked data. It contains a single `modify()` method whose \n signature is left to the implementation. It may walk the entire tree, unpacking on the way.\n - `bal.context_ioc.AbstractAnalyzer` An analyzer extracts data from a tree. The type of \n the returned data is defined by the concrete analyzer implementation. It contains a single \n `analyze()` method whose signature is left to the implementation. It may walk the entire tree, unpacking on the way.\n - `bal.context_ioc.BALIocContext` The IoC context provides a simple implementation of the \n [Inversion of Control](https://en.wikipedia.org/wiki/Inversion_of_control) pattern. It looks up\n the implementation of a given interface and returns a new instance. It is used to instantiate \n an `AbstractConverter`, `AbstractModifier` or `AbstractAnalyzer`. \n - For `AbstractModifier`s and `AbstractAnalyzer`s, an interface extending the `AbstractModifier` or `AbstractAnalyzer` is supplied and an implementation of the interface is returned.\n - For `AbstractConverter`s, an interface extending `DataModel` is supplied and an `AbstractConverter` implementation is returned. This implementation's `unpack()` method will create an instance of the supplied `DataModel` interface, and its `pack()` method will serialize an instance of the supplied `DataModel` interface.\n- `bal.context_ioc.BALIoCContextFactory` Creates a configured instance of the `BALIoCContext`. It\n provides methods for the user to register the implementation of interfaces.\n- `bal.context.BALContext` A new context is created for each tree. It inherits from the \n `BALIoCContext`. The context contains a reference to the root `DataObject`. It may be used as \n a cache for analyzers that are either expensive or frequently called. 
It may also be used to \n store data that does not fit cleanly into a tree (for example, relationships between unrelated \n nodes). \n - `bal.context.BALContextFactory` As its name implies, it is responsible for creating a `BALContext`. The\n factory is a good place to load external configuration that will be passed to the context. In \n most settings, the factory would be created when the application starts and destroyed when it \n exits.\n - `bal.context.BALManager` The BAL manager offers a way to look up factories using a key. It is \n not strictly necessary, and should only be used in applications that need to dynamically retrieve multiple different \n context factories.\n\nThe full documentation for the API is available on \n [github.io](https://ballon-rouge.github.io/bal/)\n\n\n\n## Guide\n\nAll the code for this guide is contained in the [./example](./example) folder.\n\nThe first step is to declare a new `DataModel` class that defines the data structure for the root \nnode and its children.\nFor example, a Xilinx bitstream has 3 children: the header, a sync marker and the config packets.\nThe format of the header is not known, the sync marker does not have a format, and the packets \nare an array of unknown data.\n\n```python\nclass XilinxPacketsInterface(DataModel):\n    \"\"\"\n    An array of Xilinx register configuration packets.\n    \"\"\"\n\n\nclass XilinxBitstreamHeaderInterface(DataModel):\n    \"\"\"\n    The Xilinx bitstream header contains unknown information.\n    \"\"\"\n\n\nclass XilinxBitstreamSyncMarkerInterface(DataModel):\n    \"\"\"\n    The Xilinx bitstream sync marker\n    \"\"\"\n\n\nclass XilinxBitstream(ClassModel[DataObject]):\n    \"\"\"\n    The root model for a Xilinx bitstream. 
It contains data objects for a header, sync marker, and packets.\n    \"\"\"\n\n    def __init__(\n        self,\n        header,\n        sync_marker,\n        packets\n    ):\n        \"\"\"\n        :param DataObject[XilinxBitstreamHeaderInterface] header:\n        :param DataObject[XilinxBitstreamSyncMarkerInterface] sync_marker:\n        :param DataObject[XilinxPacketsInterface] packets:\n        \"\"\"\n        super(XilinxBitstream, self).__init__((\n            (\"header\", self.get_header),\n            (\"sync_marker\", self.get_sync_marker),\n            (\"packets\", self.get_packets),\n        ))\n        self.header = header\n        self.sync_marker = sync_marker\n        self.packets = packets\n\n    def get_header(self):\n        return self.header\n\n    def get_sync_marker(self):\n        return self.sync_marker\n\n    def get_packets(self):\n        return self.packets\n```\n\nNote that even though the structure of the children is unknown, an interface\n is still created for each of them. As we will see, this allows an external developer to define\n their format as well as their converters later.\n\nNow that we have the models, we are ready to create the root converter:\n\n```python\nclass XilinxBitstreamConverter(AbstractConverter):\n    \"\"\"\n    Converter for a Xilinx FPGA bitstream\n\n    :param BALContext context: The BAL context.\n    \"\"\"\n\n    def __init__(self, context):\n        super(XilinxBitstreamConverter, self).__init__(context)\n        self.context = context\n\n    def unpack(self, data_bytes):\n        sync_marker = self.context.format.sync_word\n        sync_marker_index = data_bytes.find(sync_marker)\n        assert sync_marker_index >= 0, \\\n            \"The sync marker is not present in the provided bitstream data\"\n\n        assert sync_marker_index + len(sync_marker) < len(data_bytes) - 2, \\\n            \"The configuration data is expected to contain at least one word size worth of data\"\n\n        return XilinxBitstream(\n            DataObject.create_packed(\n                self.context,\n                data_bytes[:sync_marker_index],\n                XilinxBitstreamHeaderInterface\n            ),\n            DataObject.create_packed(\n                self.context,\n                data_bytes[sync_marker_index:sync_marker_index+len(sync_marker)],\n                
XilinxBitstreamSyncMarkerInterface,\n            ),\n            DataObject.create_packed(\n                self.context,\n                data_bytes[sync_marker_index + len(sync_marker):],\n                XilinxPacketsInterface,\n            )\n        )\n\n    def pack(self, data_model):\n        \"\"\"\n        :param XilinxBitstream data_model:\n        :rtype: bytes\n        \"\"\"\n        assert isinstance(data_model, XilinxBitstream)\n        return b\"\".join([\n            data_model.get_header().pack(),\n            self.context.format.sync_word,\n            data_model.get_packets().pack()\n        ])\n```\n\nThis is already getting a bit more complicated. \nThe converter takes a `BALContext` as an argument, which implies that a converter instance must be \ndedicated to a specific bitstream.\nThe `unpack()` method does not instantiate the `DataModel` of any of its children; it only creates a \n`DataObject` that wraps the packed data for each model. \nIt provides each `DataObject` with the interface of the wrapped data model.\nThe `DataObject` uses the interface to extract basic information about the packed data (i.e. type \nand description from the interface name and its docstring). \nIt uses the interface when it is unpacked as well, looking up a converter implementation for that \ninterface inside the `BALContext` (remember that it inherits from the `BALIoCContext`).\nThis is an important property as it allows the tree to be \"lazily\" unpacked. 
\nThe user controls exactly when a given child is unpacked (if it gets unpacked at all), which can \nlead to significantly better performance in many use cases.\n\nLast but not least, we need a `BALContext` and a `BALContextFactory` implementation:\n\n```python\n\nclass XilinxContext(BALContext):\n    \"\"\"\n    :param Dict[Type[DataModel],Type[AbstractConverter]] converters_by_type:\n    :param Dict[Type[AnalyzerInterface],Type[AbstractAnalyzer]] analyzers_by_type:\n    :param Dict[Type[ModifierInterface],Type[AbstractModifier]] modifiers_by_type:\n    :param bytes bytes: The bytes making up the bitstream.\n    \"\"\"\n    def __init__(\n        self,\n        converters_by_type,\n        analyzers_by_type,\n        modifiers_by_type,\n        bytes\n    ):\n        super(XilinxContext, self).__init__(\n            converters_by_type,\n            analyzers_by_type,\n            modifiers_by_type\n        )\n        self._bitstream = DataObject.create_packed(self, bytes, XilinxBitstream)\n\n    def get_data(self):\n        \"\"\"\n        :rtype: DataObject[XilinxBitstream]\n        \"\"\"\n        return self._bitstream\n\n\nclass XilinxContextFactory(BALContextFactory):\n    def __init__(self):\n        super(XilinxContextFactory, self).__init__()\n\n    def create(self, data):\n        \"\"\"\n        :param bytes data: The bytes for the Xilinx FPGA bitstream\n        :rtype: XilinxContext\n        \"\"\"\n        return XilinxContext(\n            self._converters_by_type,\n            self._analyzers_by_type,\n            self._modifiers_by_type,\n            data\n        )\n```\n\nSince our Xilinx implementation is pretty limited, both the context and its factory are trivial. 
\n\nLet's see our implementation in action:\n\n```python\nimport wget\n\ncontext_factory = XilinxContextFactory()\n# Register the XilinxBitstreamConverter\ncontext_factory.register_converter(XilinxBitstream, XilinxBitstreamConverter)\nlx9_bin = wget.download('https://redballoonsecurity.com/files/JwfEU4veQSNFao8h/lx9.bin')\nwith open(lx9_bin, \"rb\") as f:\n    data = f.read()\ncontext = context_factory.create(data)\nbitstream_object = context.get_data()\nprint(\"Bitstream object: {}\".format(bitstream_object))\nprint(\"Bitstream model type: {}\".format(bitstream_object.get_model_type()))\nprint(\"Bitstream model description: {}\".format(bitstream_object.get_model_description()))\n\nprint(\"\\nUNPACKING\\n\")\n\nbitstream_object.unpack()\nprint(\"Bitstream object: {}\".format(bitstream_object))\n\nprint(\"\\nHEADER\\n\")\n\nheader_object = bitstream_object.get_model().get_header()\nprint(\"Bitstream header object: {}\".format(header_object))\nprint(\"Bitstream header model type: {}\".format(header_object.get_model_type()))\nprint(\"Bitstream header model description: {}\".format(header_object.get_model_description()))\n```\n\nThis script should print:\n\n```\nBitstream object: PackedXilinxBitstream(340604)\nBitstream model type: XilinxBitstream\nBitstream model description: The root model for a Xilinx bitstream. It contains a header and packets data objects.\n\nUNPACKING\n\nBitstream object: XilinxBitstream({\n header: PackedXilinxBitstreamHeaderInterface(16), \n sync_marker: PackedXilinxBitstreamSyncMarkerInterface(4), \n packets: PackedXilinxPacketsInterface(340584), \n})\n\nHEADER\n\nBitstream header object: PackedXilinxBitstreamHeaderInterface(16)\nBitstream header model type: XilinxBitstreamHeaderInterface\nBitstream header model description: The Xilinx bitstream header contains unknown information.\n```\n\nAs you can see from the output, the BAL framework already has a fair amount of information about the \nstructure of the bitstream. 
\nIt uses the docstring defined on the interfaces to pull a description of the data models, even if\nthey cannot be unpacked yet. \n\nThis is it for this guide. \nYour next steps might be to implement the XilinxPacketsInterface, XilinxBitstreamHeaderInterface, \nand XilinxBitstreamSyncMarkerInterface interfaces and implement their respective converters. \nIf you want to learn more about writing a full chain of converters, analyzers and modifiers, \nhead over to the [bal-xilinx](https://github.com/RedBalloonShenanigans/bal-xilinx) project.\n\n\n\n\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "bal", "package_url": "https://pypi.org/project/bal/", "platform": "", "project_url": "https://pypi.org/project/bal/", "project_urls": null, "release_url": "https://pypi.org/project/bal/0.1/", "requires_dist": [ "six", "enum34 ; python_version < \"3.4\"", "typing ; python_version < \"3.5\"", "sphinx ; extra == 'docs'", "sphinx-rtd-theme ; extra == 'docs'", "m2r ; extra == 'docs'", "wget ; extra == 'examples'" ], "requires_python": "", "summary": "A framework for analyzing and manipulating binary data", "version": "0.1" }, "last_serial": 5652890, "releases": { "0.1": [ { "comment_text": "", "digests": { "md5": "3acbafaff4ccabfe9859ea7efeff0412", "sha256": "a830bef47df8459cb5e8411c38ccc814d3ff66403f8848caa32d41e9fad119a0" }, "downloads": -1, "filename": "bal-0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "3acbafaff4ccabfe9859ea7efeff0412", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 14703, "upload_time": "2019-08-09T01:00:19", "url": "https://files.pythonhosted.org/packages/3f/f6/a006f378319cebc6b4bb2d506e152e1b1a794edbe0b4db425623805b8eee/bal-0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": 
"9424ccca358db6fa2b58236615cc90e8", "sha256": "ef2f265f7572bf1af0bfcbddb16dfa4a5927b1819ed2419c860de55d571251ab" }, "downloads": -1, "filename": "bal-0.1.tar.gz", "has_sig": false, "md5_digest": "9424ccca358db6fa2b58236615cc90e8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 16853, "upload_time": "2019-08-09T01:00:22", "url": "https://files.pythonhosted.org/packages/4a/5c/5d0c4d43150a15b755bcbed3dce3254a7f8d38f9ff8c764fa00c80279b7f/bal-0.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "3acbafaff4ccabfe9859ea7efeff0412", "sha256": "a830bef47df8459cb5e8411c38ccc814d3ff66403f8848caa32d41e9fad119a0" }, "downloads": -1, "filename": "bal-0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "3acbafaff4ccabfe9859ea7efeff0412", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 14703, "upload_time": "2019-08-09T01:00:19", "url": "https://files.pythonhosted.org/packages/3f/f6/a006f378319cebc6b4bb2d506e152e1b1a794edbe0b4db425623805b8eee/bal-0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "9424ccca358db6fa2b58236615cc90e8", "sha256": "ef2f265f7572bf1af0bfcbddb16dfa4a5927b1819ed2419c860de55d571251ab" }, "downloads": -1, "filename": "bal-0.1.tar.gz", "has_sig": false, "md5_digest": "9424ccca358db6fa2b58236615cc90e8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 16853, "upload_time": "2019-08-09T01:00:22", "url": "https://files.pythonhosted.org/packages/4a/5c/5d0c4d43150a15b755bcbed3dce3254a7f8d38f9ff8c764fa00c80279b7f/bal-0.1.tar.gz" } ] }