{
"info": {
"author": "Jim Pivarski",
"author_email": "pivarski@princeton.edu",
"bugtrack_url": null,
"classifiers": [
"Development Status :: 5 - Production/Stable",
"Intended Audience :: Developers",
"Intended Audience :: Information Technology",
"Intended Audience :: Science/Research",
"License :: OSI Approved :: BSD License",
"Operating System :: POSIX :: Linux",
"Programming Language :: Python",
"Programming Language :: Python :: 2.7",
"Programming Language :: Python :: 3.5",
"Programming Language :: Python :: 3.6",
"Programming Language :: Python :: 3.7",
"Programming Language :: Python :: 3.8",
"Topic :: Scientific/Engineering",
"Topic :: Scientific/Engineering :: Information Analysis",
"Topic :: Scientific/Engineering :: Mathematics",
"Topic :: Scientific/Engineering :: Physics",
"Topic :: Software Development",
"Topic :: Utilities"
],
"description": "
\n\nAwkward Array is a library for **nested, variable-sized data**, including arbitrary-length lists, records, mixed types, and missing data, using **NumPy-like idioms**.\n\nArrays are **dynamically typed**, but operations on them are **compiled and fast**. Their behavior coincides with NumPy when array dimensions are regular and generalizes when they're not.\n\n# Motivating example\n\nGiven an array of objects with `x`, `y` fields and variable-length nested lists like\n\n```python\nimport awkward as ak\nimport numpy as np\n\narray = ak.Array([\n    [{\"x\": 1.1, \"y\": [1]}, {\"x\": 2.2, \"y\": [1, 2]}, {\"x\": 3.3, \"y\": [1, 2, 3]}],\n    [],\n    [{\"x\": 4.4, \"y\": [1, 2, 3, 4]}, {\"x\": 5.5, \"y\": [1, 2, 3, 4, 5]}]\n])\n```\n\nthe following slices out the `y` values, drops the first element from each inner list, and runs NumPy's `np.square` function on everything that is left:\n\n```python\noutput = np.square(array[\"y\", ..., 1:])\n```\n\nThe result is\n\n```python\n[\n    [[], [4], [4, 9]],\n    [],\n    [[4, 9, 16], [4, 9, 16, 25]]\n]\n```\n\nThe equivalent using only Python is\n\n```python\noutput = []\nfor sublist in array:\n    tmp1 = []\n    for record in sublist:\n        tmp2 = []\n        # drop the first element of each inner list, then square the rest\n        for number in record[\"y\"][1:]:\n            tmp2.append(np.square(number))\n        tmp1.append(tmp2)\n    output.append(tmp1)\n```\n\nNot only is the expression using Awkward Arrays more concise, using idioms familiar from NumPy, but it's much faster and uses less memory.\n\nFor a similar problem 10 million times larger than the one above (on a single-threaded 2.2 GHz processor),\n\n * the Awkward Array one-liner takes **4.6 seconds** to run and uses **2.1 GB** of memory,\n * the equivalent using Python lists and dicts takes **138 seconds** to run and uses **22 GB** of memory.\n\nSpeed and memory factors in the double digits are common because we're replacing Python's dynamically typed, pointer-chasing virtual machine with type-specialized, precompiled routines on contiguous data. (In other words, for the same reasons as NumPy.) 
Even higher speedups are possible when Awkward Array is paired with [Numba](https://numba.pydata.org/).\n\nOur [presentation at SciPy 2020](https://youtu.be/WlnUF3LRBj4) provides a good introduction, showing how to use these arrays in a real analysis.\n\n# Installation\n\nAwkward Array can be installed [from PyPI](https://pypi.org/project/awkward) using pip:\n\n```bash\npip install awkward\n```\n\nYou will likely get a precompiled binary (wheel), depending on your operating system and Python version. If not, pip attempts to compile from source (which requires a C++ compiler, make, and CMake).\n\nAwkward Array is also available using [conda](https://anaconda.org/conda-forge/awkward), which always installs a binary:\n\n```bash\nconda install -c conda-forge awkward\n```\n\nIf you have already added `conda-forge` as a channel, the `-c conda-forge` is unnecessary. Adding the channel is recommended because it ensures that all of your packages use compatible versions:\n\n```bash\nconda config --add channels conda-forge\nconda update --all\n```\n\n## Getting help\n\n
 * How-to tutorials\n * Python API reference\n * C++ API reference\n