{ "info": { "author": "Jim Pivarski (IRIS-HEP)", "author_email": "pivarski@princeton.edu", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Intended Audience :: Developers", "Intended Audience :: Information Technology", "Intended Audience :: Science/Research", "License :: OSI Approved :: BSD License", "Operating System :: MacOS", "Operating System :: POSIX", "Operating System :: Unix", "Programming Language :: Python", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Topic :: Scientific/Engineering", "Topic :: Scientific/Engineering :: Information Analysis", "Topic :: Scientific/Engineering :: Mathematics", "Topic :: Scientific/Engineering :: Physics", "Topic :: Software Development", "Topic :: Utilities" ], "description": "![](https://github.com/scikit-hep/aghast/raw/master/docs/source/logo-300px.png)\n\n# aghast\n\n[![Build Status](https://travis-ci.org/scikit-hep/aghast.svg?branch=master)](https://travis-ci.org/scikit-hep/aghast) [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/scikit-hep/aghast/master?urlpath=lab/tree/binder%2Ftutorial.ipynb)\n\nAghast is a histogramming library that does not fill histograms and does not plot them. Its role is behind the scenes, to provide better communication between histogramming libraries.\n\nSpecifically, it is a structured representation of **ag**gregated, **h**istogram-like **st**atistics as sharable \"ghasts.\" It has all of the \"bells and whistles\" often associated with plain histograms, such as number of entries, unbinned mean and standard deviation, bin errors, associated fit functions, profile plots, and even simple ntuples (needed for unbinned fits or machine learning applications). 
[ROOT](https://root.cern.ch/root/htmldoc/guides/users-guide/Histograms.html) has all of these features; [Numpy](https://docs.scipy.org/doc/numpy/reference/generated/numpy.histogram.html) has none of them.\n\nThe purpose of aghast is to be an intermediate when converting ROOT histograms into Numpy, or vice-versa, or both of these into [Boost.Histogram](https://github.com/boostorg/histogram), [Physt](https://physt.readthedocs.io/en/latest/index.html), [Pandas](https://pandas.pydata.org), etc. Without an intermediate representation, converting between _N_ libraries (to get the advantages of all) would require _N(N \u2012 1)/2_ conversion routines; with an intermediate representation, we only need _N_, and the mapping of feature to feature can be made explicit in terms of a common language.\n\nFurthermore, aghast is a [Flatbuffers](http://google.github.io/flatbuffers/) schema, so it can be deciphered in [many languages](https://google.github.io/flatbuffers/flatbuffers_support.html) with [lazy, random access](https://github.com/mzaks/FlatBuffersSwift/wiki/FlatBuffers-Explained), and it uses a [small amount of memory](http://google.github.io/flatbuffers/md__benchmarks.html). 
A collection of histograms, functions, and ntuples can be shared among processes as shared memory, used in remote procedure calls, processed incrementally in a memory-mapped file, or saved in files with future-proof [schema evolution](https://google.github.io/flatbuffers/md__schemas.html).\n\n## Installation from packages\n\nInstall aghast like any other Python package:\n\n```bash\npip install aghast # maybe with sudo or --user, or in virtualenv\n```\n\n_(Not on conda yet.)_\n\n## Manual installation\n\nAfter you git-clone this GitHub repository and ensure that `numpy` is installed:\n\n```bash\npip install \"flatbuffers>=1.8.0\" # for the flatbuffers runtime (with Numpy)\ncd python # the only implementation so far is in Python\npython setup.py install # to use it outside of this directory\n```\n\nNow you should be able to `import aghast` or `from aghast import *` in Python.\n\nIf you need to change `flatbuffers/aghast.fbs`, you'll additionally need to:\n\n 1. Get `flatc` to generate Python sources from `flatbuffers/aghast.fbs`. I use `conda install -c conda-forge flatbuffers`. (The `flatc` executable is _not_ included in the pip `flatbuffers` package, and the Python runtime is _not_ included in the conda `flatbuffers` package. They're disjoint.)\n 2. 
In the `python` directory, run `./generate_flatbuffers.py` (which calls `flatc` and does some post-processing).\n\nEvery time you change `flatbuffers/aghast.fbs`, re-run `./generate_flatbuffers.py`.\n\n## Documentation\n\nFull specification:\n\n * [Introduction](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#introduction)\n * [Data types](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#data-types)\n * [Collection](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#collection)\n * [Histogram](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#histogram)\n * [Axis](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#axis)\n * [IntegerBinning](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#integerbinning)\n * [RegularBinning](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#regularbinning)\n * [RealInterval](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#realinterval)\n * [RealOverflow](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#realoverflow)\n * [HexagonalBinning](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#hexagonalbinning)\n * [EdgesBinning](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#edgesbinning)\n * [IrregularBinning](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#irregularbinning)\n * [CategoryBinning](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#categorybinning)\n * [SparseRegularBinning](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#sparseregularbinning)\n * [FractionBinning](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#fractionbinning)\n * [PredicateBinning](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#predicatebinning)\n * 
[VariationBinning](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#variationbinning)\n * [Variation](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#variation)\n * [Assignment](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#assignment)\n * [UnweightedCounts](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#unweightedcounts)\n * [WeightedCounts](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#weightedcounts)\n * [InterpretedInlineBuffer](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#interpretedinlinebuffer)\n * [InterpretedInlineInt64Buffer](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#interpretedinlineint64buffer)\n * [InterpretedInlineFloat64Buffer](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#interpretedinlinefloat64buffer)\n * [InterpretedExternalBuffer](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#interpretedexternalbuffer)\n * [Profile](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#profile)\n * [Statistics](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#statistics)\n * [Moments](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#moments)\n * [Quantiles](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#quantiles)\n * [Modes](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#modes)\n * [Extremes](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#extremes)\n * [StatisticFilter](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#statisticfilter)\n * [Covariance](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#covariance)\n * [ParameterizedFunction](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#parameterizedfunction)\n * 
[Parameter](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#parameter)\n * [EvaluatedFunction](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#evaluatedfunction)\n * [BinnedEvaluatedFunction](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#binnedevaluatedfunction)\n * [Ntuple](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#ntuple)\n * [Column](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#column)\n * [NtupleInstance](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#ntupleinstance)\n * [Chunk](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#chunk)\n * [ColumnChunk](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#columnchunk)\n * [Page](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#page)\n * [RawInlineBuffer](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#rawinlinebuffer)\n * [RawExternalBuffer](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#rawexternalbuffer)\n * [Metadata](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#metadata)\n * [Decoration](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#decoration)\n\n## Tutorial examples\n\n[Run this tutorial on Binder](https://mybinder.org/v2/gh/scikit-hep/aghast/master?urlpath=lab/tree/binder%2Ftutorial.ipynb).\n\n### Conversions\n\nThe main purpose of aghast is to move aggregated, histogram-like statistics (called \"ghasts\") from one framework to the next. 
This requires a conversion of high-level domain concepts.\n\nConsider the following example: in Numpy, a histogram is simply a 2-tuple of arrays with special meaning\u2014bin contents, then bin edges.\n\n\n```python\nimport numpy\n\nnumpy_hist = numpy.histogram(numpy.random.normal(0, 1, int(10e6)), bins=80, range=(-5, 5))\nnumpy_hist\n```\n\n\n\n\n (array([ 2, 5, 9, 15, 29, 49, 80, 104,\n 237, 352, 555, 867, 1447, 2046, 3037, 4562,\n 6805, 9540, 13529, 18584, 25593, 35000, 46024, 59103,\n 76492, 96441, 119873, 146159, 177533, 210628, 246316, 283292,\n 321377, 359314, 393857, 426446, 453031, 474806, 489846, 496646,\n 497922, 490499, 473200, 453527, 425650, 393297, 358537, 321099,\n 282519, 246469, 211181, 177550, 147417, 120322, 96592, 76665,\n 59587, 45776, 34459, 25900, 18876, 13576, 9571, 6662,\n 4629, 3161, 2069, 1334, 878, 581, 332, 220,\n 135, 65, 39, 26, 19, 15, 4, 4]),\n array([-5. , -4.875, -4.75 , -4.625, -4.5 , -4.375, -4.25 , -4.125,\n -4. , -3.875, -3.75 , -3.625, -3.5 , -3.375, -3.25 , -3.125,\n -3. , -2.875, -2.75 , -2.625, -2.5 , -2.375, -2.25 , -2.125,\n -2. , -1.875, -1.75 , -1.625, -1.5 , -1.375, -1.25 , -1.125,\n -1. , -0.875, -0.75 , -0.625, -0.5 , -0.375, -0.25 , -0.125,\n 0. , 0.125, 0.25 , 0.375, 0.5 , 0.625, 0.75 , 0.875,\n 1. , 1.125, 1.25 , 1.375, 1.5 , 1.625, 1.75 , 1.875,\n 2. , 2.125, 2.25 , 2.375, 2.5 , 2.625, 2.75 , 2.875,\n 3. , 3.125, 3.25 , 3.375, 3.5 , 3.625, 3.75 , 3.875,\n 4. , 4.125, 4.25 , 4.375, 4.5 , 4.625, 4.75 , 4.875,\n 5. 
]))\n\n\n\nWe convert that into the aghast equivalent (a \"ghast\") with a connector (two functions: `from_numpy` and `to_numpy`).\n\n\n```python\nimport aghast\n\nghastly_hist = aghast.from_numpy(numpy_hist)\nghastly_hist\n```\n\n\n\n\n \n\n\n\nThis object is instantiated from a class structure built from simple pieces.\n\n\n```python\nghastly_hist.dump()\n```\n\n Histogram(\n axis=[\n Axis(binning=RegularBinning(num=80, interval=RealInterval(low=-5.0, high=5.0)))\n ],\n counts=\n UnweightedCounts(\n counts=\n InterpretedInlineInt64Buffer(\n buffer=\n [ 2 5 9 15 29 49 80 104 237 352\n 555 867 1447 2046 3037 4562 6805 9540 13529 18584\n 25593 35000 46024 59103 76492 96441 119873 146159 177533 210628\n 246316 283292 321377 359314 393857 426446 453031 474806 489846 496646\n 497922 490499 473200 453527 425650 393297 358537 321099 282519 246469\n 211181 177550 147417 120322 96592 76665 59587 45776 34459 25900\n 18876 13576 9571 6662 4629 3161 2069 1334 878 581\n 332 220 135 65 39 26 19 15 4 4])))\n\n\nNow it can be converted to a ROOT histogram with another connector.\n\n\n```python\nroot_hist = aghast.to_root(ghastly_hist, \"root_hist\")\nroot_hist\n```\n\n \n\n```python\nimport ROOT\ncanvas = ROOT.TCanvas()\nroot_hist.Draw()\ncanvas.Draw()\n```\n\n\n![png](docs/tutorial-9_0.png)\n\n\nAnd Pandas with yet another connector.\n\n\n```python\npandas_hist = aghast.to_pandas(ghastly_hist)\npandas_hist\n```\n\n\n\n\n
\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
| | unweighted |
|---|---|
| [-5.0, -4.875) | 2 |
| [-4.875, -4.75) | 5 |
| [-4.75, -4.625) | 9 |
| [-4.625, -4.5) | 15 |
| [-4.5, -4.375) | 29 |
| [-4.375, -4.25) | 49 |
| [-4.25, -4.125) | 80 |
| [-4.125, -4.0) | 104 |
| [-4.0, -3.875) | 237 |
| [-3.875, -3.75) | 352 |
| [-3.75, -3.625) | 555 |
| [-3.625, -3.5) | 867 |
| [-3.5, -3.375) | 1447 |
| [-3.375, -3.25) | 2046 |
| [-3.25, -3.125) | 3037 |
| [-3.125, -3.0) | 4562 |
| [-3.0, -2.875) | 6805 |
| [-2.875, -2.75) | 9540 |
| [-2.75, -2.625) | 13529 |
| [-2.625, -2.5) | 18584 |
| [-2.5, -2.375) | 25593 |
| [-2.375, -2.25) | 35000 |
| [-2.25, -2.125) | 46024 |
| [-2.125, -2.0) | 59103 |
| [-2.0, -1.875) | 76492 |
| [-1.875, -1.75) | 96441 |
| [-1.75, -1.625) | 119873 |
| [-1.625, -1.5) | 146159 |
| [-1.5, -1.375) | 177533 |
| [-1.375, -1.25) | 210628 |
| ... | ... |
| [1.25, 1.375) | 211181 |
| [1.375, 1.5) | 177550 |
| [1.5, 1.625) | 147417 |
| [1.625, 1.75) | 120322 |
| [1.75, 1.875) | 96592 |
| [1.875, 2.0) | 76665 |
| [2.0, 2.125) | 59587 |
| [2.125, 2.25) | 45776 |
| [2.25, 2.375) | 34459 |
| [2.375, 2.5) | 25900 |
| [2.5, 2.625) | 18876 |
| [2.625, 2.75) | 13576 |
| [2.75, 2.875) | 9571 |
| [2.875, 3.0) | 6662 |
| [3.0, 3.125) | 4629 |
| [3.125, 3.25) | 3161 |
| [3.25, 3.375) | 2069 |
| [3.375, 3.5) | 1334 |
| [3.5, 3.625) | 878 |
| [3.625, 3.75) | 581 |
| [3.75, 3.875) | 332 |
| [3.875, 4.0) | 220 |
| [4.0, 4.125) | 135 |
| [4.125, 4.25) | 65 |
| [4.25, 4.375) | 39 |
| [4.375, 4.5) | 26 |
| [4.5, 4.625) | 19 |
| [4.625, 4.75) | 15 |
| [4.75, 4.875) | 4 |
| [4.875, 5.0) | 4 |
\n

80 rows \u00d7 1 columns

\n
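To make the idea of a connector more concrete, here is a minimal sketch of how a Numpy histogram tuple could be mapped onto an interval-indexed Pandas table, using only numpy and pandas. This is not how aghast implements `to_pandas`; the helper name `numpy_hist_to_frame` and its `column` argument are assumptions made for illustration, chosen to mirror the table shown above.

```python
import numpy
import pandas

def numpy_hist_to_frame(hist, column='unweighted'):
    # Hypothetical helper, not part of aghast: illustrates the mapping only.
    counts, edges = hist
    if len(edges) != len(counts) + 1:
        raise ValueError('expected exactly one more edge than bins')
    # Numpy bins are half-open [low, high), except the last one, which also
    # includes its upper edge; closed='left' ignores that special case.
    index = pandas.IntervalIndex.from_breaks(edges, closed='left')
    return pandas.DataFrame({column: counts}, index=index)

# A small, seeded example so that the result is reproducible.
rng = numpy.random.RandomState(12345)
hist = numpy.histogram(rng.normal(0, 1, 10000), bins=8, range=(-4, 4))
frame = numpy_hist_to_frame(hist)
print(frame)
```

A direct mapping like this has to be rewritten for every pair of libraries, which is the duplication an intermediate representation is meant to avoid.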
\n\n\n\n### Serialization\n\nA ghast is also a [Flatbuffers](http://google.github.io/flatbuffers/) object, which has a [multi-lingual](https://google.github.io/flatbuffers/flatbuffers_support.html), [random-access](https://github.com/mzaks/FlatBuffersSwift/wiki/FlatBuffers-Explained), [small-footprint](http://google.github.io/flatbuffers/md__benchmarks.html) serialization:\n\n\n```python\nghastly_hist.tobuffer()\n```\n\n bytearray(\"\\x04\\x00\\x00\\x00\\x90\\xff\\xff\\xff\\x10\\x00\\x00\\x00\\x00\\x01\\n\\x00\\x10\\x00\\x0c\\x00\\x0b\\x00\\x04\n \\x00\\n\\x00\\x00\\x00`\\x00\\x00\\x00\\x00\\x00\\x00\\x01\\x04\\x00\\x00\\x00\\x01\\x00\\x00\\x00\\x0c\\x00\\x00\n \\x00\\x08\\x00\\x0c\\x00\\x0b\\x00\\x04\\x00\\x08\\x00\\x00\\x00\\x10\\x00\\x00\\x00\\x00\\x00\\x00\\x02\\x08\\x00\n (\\x00\\x1c\\x00\\x04\\x00\\x08\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x14\\xc0\\x00\\x00\\x00\\x00\\x00\n \\x00\\x14@\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\x00P\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x08\n \\x00\\n\\x00\\t\\x00\\x04\\x00\\x08\\x00\\x00\\x00\\x0c\\x00\\x00\\x00\\x00\\x02\\x06\\x00\\x08\\x00\\x04\\x00\\x06\n \\x00\\x00\\x00\\x04\\x00\\x00\\x00\\x80\\x02\\x00\\x00\\x02\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x05\\x00\\x00\\x00\n \\x00\\x00\\x00\\x00\\t\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x0f\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x1d\\x00\\x00\n \\x00\\x00\\x00\\x00\\x001\\x00\\x00\\x00\\x00\\x00\\x00\\x00P\\x00\\x00\\x00\\x00\\x00\\x00\\x00h\\x00\\x00\\x00\n \\x00\\x00\\x00\\x00\\xed\\x00\\x00\\x00\\x00\\x00\\x00\\x00`\\x01\\x00\\x00\\x00\\x00\\x00\\x00+\\x02\\x00\\x00\n \\x00\\x00\\x00\\x00c\\x03\\x00\\x00\\x00\\x00\\x00\\x00\\xa7\\x05\\x00\\x00\\x00\\x00\\x00\\x00\\xfe\\x07\\x00\n \\x00\\x00\\x00\\x00\\x00\\xdd\\x0b\\x00\\x00\\x00\\x00\\x00\\x00\\xd2\\x11\\x00\\x00\\x00\\x00\\x00\\x00\\x95\\x1a\n \\x00\\x00\\x00\\x00\\x00\\x00D%\\x00\\x00\\x00\\x00\\x00\\x00\\xd94\\x00\\x00\\x00\\x00\\x00\\x00\\x98H\\x00\\x00\n 
\\x00\\x00\\x00\\x00\\xf9c\\x00\\x00\\x00\\x00\\x00\\x00\\xb8\\x88\\x00\\x00\\x00\\x00\\x00\\x00\\xc8\\xb3\\x00\\x00\n \\x00\\x00\\x00\\x00\\xdf\\xe6\\x00\\x00\\x00\\x00\\x00\\x00\\xcc*\\x01\\x00\\x00\\x00\\x00\\x00\\xb9x\\x01\\x00\n \\x00\\x00\\x00\\x00A\\xd4\\x01\\x00\\x00\\x00\\x00\\x00\\xef:\\x02\\x00\\x00\\x00\\x00\\x00}\\xb5\\x02\\x00\\x00\n \\x00\\x00\\x00\\xc46\\x03\\x00\\x00\\x00\\x00\\x00,\\xc2\\x03\\x00\\x00\\x00\\x00\\x00\\x9cR\\x04\\x00\\x00\\x00\n \\x00\\x00a\\xe7\\x04\\x00\\x00\\x00\\x00\\x00\\x92{\\x05\\x00\\x00\\x00\\x00\\x00\\x81\\x02\\x06\\x00\\x00\\x00\n \\x00\\x00\\xce\\x81\\x06\\x00\\x00\\x00\\x00\\x00\\xa7\\xe9\\x06\\x00\\x00\\x00\\x00\\x00\\xb6>\\x07\\x00\\x00\n \\x00\\x00\\x00vy\\x07\\x00\\x00\\x00\\x00\\x00\\x06\\x94\\x07\\x00\\x00\\x00\\x00\\x00\\x02\\x99\\x07\\x00\\x00\n \\x00\\x00\\x00\\x03|\\x07\\x00\\x00\\x00\\x00\\x00p8\\x07\\x00\\x00\\x00\\x00\\x00\\x97\\xeb\\x06\\x00\\x00\\x00\n \\x00\\x00\\xb2~\\x06\\x00\\x00\\x00\\x00\\x00Q\\x00\\x06\\x00\\x00\\x00\\x00\\x00\\x89x\\x05\\x00\\x00\\x00\\x00\n \\x00K\\xe6\\x04\\x00\\x00\\x00\\x00\\x00\\x97O\\x04\\x00\\x00\\x00\\x00\\x00\\xc5\\xc2\\x03\\x00\\x00\\x00\\x00\n \\x00\\xed8\\x03\\x00\\x00\\x00\\x00\\x00\\x8e\\xb5\\x02\\x00\\x00\\x00\\x00\\x00\\xd9?\\x02\\x00\\x00\\x00\\x00\n \\x00\\x02\\xd6\\x01\\x00\\x00\\x00\\x00\\x00Py\\x01\\x00\\x00\\x00\\x00\\x00y+\\x01\\x00\\x00\\x00\\x00\\x00\\xc3\n \\xe8\\x00\\x00\\x00\\x00\\x00\\x00\\xd0\\xb2\\x00\\x00\\x00\\x00\\x00\\x00\\x9b\\x86\\x00\\x00\\x00\\x00\\x00\\x00\n ,e\\x00\\x00\\x00\\x00\\x00\\x00\\xbcI\\x00\\x00\\x00\\x00\\x00\\x00\\x085\\x00\\x00\\x00\\x00\\x00\\x00c%\\x00\n \\x00\\x00\\x00\\x00\\x00\\x06\\x1a\\x00\\x00\\x00\\x00\\x00\\x00\\x15\\x12\\x00\\x00\\x00\\x00\\x00\\x00Y\\x0c\n \\x00\\x00\\x00\\x00\\x00\\x00\\x15\\x08\\x00\\x00\\x00\\x00\\x00\\x006\\x05\\x00\\x00\\x00\\x00\\x00\\x00n\\x03\n \\x00\\x00\\x00\\x00\\x00\\x00E\\x02\\x00\\x00\\x00\\x00\\x00\\x00L\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\xdc\\x00\n 
\\x00\\x00\\x00\\x00\\x00\\x00\\x87\\x00\\x00\\x00\\x00\\x00\\x00\\x00A\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\'\\x00\n \\x00\\x00\\x00\\x00\\x00\\x00\\x1a\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x13\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x0f\n \\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x04\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x04\\x00\\x00\\x00\\x00\\x00\\x00\\x00\")\n\n```python\nprint(\"Numpy size: \", numpy_hist[0].nbytes + numpy_hist[1].nbytes)\n\ntmessage = ROOT.TMessage()\ntmessage.WriteObject(root_hist)\nprint(\"ROOT size: \", tmessage.Length())\n\nimport pickle\nprint(\"Pandas size:\", len(pickle.dumps(pandas_hist)))\n\nprint(\"Aghast size: \", len(ghastly_hist.tobuffer()))\n```\n\n Numpy size: 1288\n ROOT size: 1962\n Pandas size: 2984\n Aghast size: 792\n\n\nAghast is generally foreseen as a memory format, like [Apache Arrow](https://arrow.apache.org), but for statistical aggregations. Like Arrow, it reduces the need to implement $N(N - 1)/2$ conversion functions among $N$ statistical libraries to just $N$ conversion functions. (See the figure on Arrow's website.)\n\n### Translation of conventions\n\nAghast also intends to be as close to zero-copy as possible. This means that it must make graceful translations among conventions. 
Different histogramming libraries handle overflow bins in different ways:\n\n\n```python\nfromroot = aghast.from_root(root_hist)\nfromroot.axis[0].binning.dump()\nprint(\"Bin contents length:\", len(fromroot.counts.array))\n```\n\n RegularBinning(\n num=80,\n interval=RealInterval(low=-5.0, high=5.0),\n overflow=RealOverflow(loc_underflow=BinLocation.below1, loc_overflow=BinLocation.above1))\n Bin contents length: 82\n\n\n\n```python\nghastly_hist.axis[0].binning.dump()\nprint(\"Bin contents length:\", len(ghastly_hist.counts.array))\n```\n\n RegularBinning(num=80, interval=RealInterval(low=-5.0, high=5.0))\n Bin contents length: 80\n\n\nAnd yet we want to be able to manipulate them as though these differences did not exist.\n\n\n```python\nsum_hist = fromroot + ghastly_hist\n```\n\n\n```python\nsum_hist.axis[0].binning.dump()\nprint(\"Bin contents length:\", len(sum_hist.counts.array))\n```\n\n RegularBinning(\n num=80,\n interval=RealInterval(low=-5.0, high=5.0),\n overflow=RealOverflow(loc_underflow=BinLocation.above1, loc_overflow=BinLocation.above2))\n Bin contents length: 82\n\n\nThe binning structure keeps track of the existence of underflow/overflow bins and where they are located.\n\n * ROOT's convention is to put underflow before the normal bins (`below1`) and overflow after (`above1`), so that the normal bins are effectively 1-indexed.\n * Boost.Histogram's convention is to put overflow after the normal bins (`above1`) and underflow after that (`above2`), so that underflow is accessed via `myhist[-1]` in Numpy.\n * Numpy histograms don't have underflow/overflow bins.\n * Pandas could have `Intervals` that extend to infinity.\n\nAghast accepts all of these, so that it doesn't have to manipulate the bin contents buffer it receives, but knows how to deal with them if it has to combine histograms that follow different conventions.\n\n### Binning types\n\nAll the different axis types have an equivalent in aghast (and not all are 
single-dimensional).\n\n\n```python\naghast.IntegerBinning(5, 10).dump()\naghast.RegularBinning(100, aghast.RealInterval(-5, 5)).dump()\naghast.HexagonalBinning(0, 100, 0, 100, aghast.HexagonalBinning.cube_xy).dump()\naghast.EdgesBinning([0.01, 0.05, 0.1, 0.5, 1, 5, 10, 50, 100]).dump()\naghast.IrregularBinning([aghast.RealInterval(0, 5),\n aghast.RealInterval(10, 100),\n aghast.RealInterval(-10, 10)],\n overlapping_fill=aghast.IrregularBinning.all).dump()\naghast.CategoryBinning([\"one\", \"two\", \"three\"]).dump()\naghast.SparseRegularBinning([5, 3, -2, 8, -100], 10).dump()\naghast.FractionBinning(error_method=aghast.FractionBinning.clopper_pearson).dump()\naghast.PredicateBinning([\"signal region\", \"control region\"]).dump()\naghast.VariationBinning([aghast.Variation([aghast.Assignment(\"x\", \"nominal\")]),\n aghast.Variation([aghast.Assignment(\"x\", \"nominal + sigma\")]),\n aghast.Variation([aghast.Assignment(\"x\", \"nominal - sigma\")])]).dump()\n```\n\n IntegerBinning(min=5, max=10)\n RegularBinning(num=100, interval=RealInterval(low=-5.0, high=5.0))\n HexagonalBinning(qmin=0, qmax=100, rmin=0, rmax=100, coordinates=HexagonalBinning.cube_xy)\n EdgesBinning(edges=[0.01 0.05 0.1 0.5 1 5 10 50 100])\n IrregularBinning(\n intervals=[\n RealInterval(low=0.0, high=5.0),\n RealInterval(low=10.0, high=100.0),\n RealInterval(low=-10.0, high=10.0)\n ],\n overlapping_fill=IrregularBinning.all)\n CategoryBinning(categories=['one', 'two', 'three'])\n SparseRegularBinning(bins=[5 3 -2 8 -100], bin_width=10.0)\n FractionBinning(error_method=FractionBinning.clopper_pearson)\n PredicateBinning(predicates=['signal region', 'control region'])\n VariationBinning(\n variations=[\n Variation(assignments=[\n Assignment(identifier='x', expression='nominal')\n ]),\n Variation(\n assignments=[\n Assignment(identifier='x', expression='nominal + sigma')\n ]),\n Variation(\n assignments=[\n Assignment(identifier='x', expression='nominal - sigma')\n ])\n ])\n\n\nThe meanings of 
these binning classes are given in [the specification](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#integerbinning), but many of them can be converted into one another, and converting to `CategoryBinning` (strings) often makes the intent clear.\n\n\n```python\naghast.IntegerBinning(5, 10).toCategoryBinning().dump()\naghast.RegularBinning(10, aghast.RealInterval(-5, 5)).toCategoryBinning().dump()\naghast.EdgesBinning([0.01, 0.05, 0.1, 0.5, 1, 5, 10, 50, 100]).toCategoryBinning().dump()\naghast.IrregularBinning([aghast.RealInterval(0, 5),\n aghast.RealInterval(10, 100),\n aghast.RealInterval(-10, 10)],\n overlapping_fill=aghast.IrregularBinning.all).toCategoryBinning().dump()\naghast.SparseRegularBinning([5, 3, -2, 8, -100], 10).toCategoryBinning().dump()\naghast.FractionBinning(error_method=aghast.FractionBinning.clopper_pearson).toCategoryBinning().dump()\naghast.PredicateBinning([\"signal region\", \"control region\"]).toCategoryBinning().dump()\naghast.VariationBinning([aghast.Variation([aghast.Assignment(\"x\", \"nominal\")]),\n aghast.Variation([aghast.Assignment(\"x\", \"nominal + sigma\")]),\n aghast.Variation([aghast.Assignment(\"x\", \"nominal - sigma\")])]\n ).toCategoryBinning().dump()\n```\n\n CategoryBinning(categories=['5', '6', '7', '8', '9', '10'])\n CategoryBinning(\n categories=['[-5, -4)', '[-4, -3)', '[-3, -2)', '[-2, -1)', '[-1, 0)', '[0, 1)', '[1, 2)', '[2, 3)',\n '[3, 4)', '[4, 5)'])\n CategoryBinning(\n categories=['[0.01, 0.05)', '[0.05, 0.1)', '[0.1, 0.5)', '[0.5, 1)', '[1, 5)', '[5, 10)', '[10, 50)',\n '[50, 100)'])\n CategoryBinning(categories=['[0, 5)', '[10, 100)', '[-10, 10)'])\n CategoryBinning(categories=['[50, 60)', '[30, 40)', '[-20, -10)', '[80, 90)', '[-1000, -990)'])\n CategoryBinning(categories=['pass', 'all'])\n CategoryBinning(categories=['signal region', 'control region'])\n CategoryBinning(categories=['x := nominal', 'x := nominal + sigma', 'x := nominal - sigma'])\n\n\nThis technique can also clear 
up confusion about overflow bins.\n\n\n```python\naghast.RegularBinning(5, aghast.RealInterval(-5, 5), aghast.RealOverflow(\n loc_underflow=aghast.BinLocation.above2,\n loc_overflow=aghast.BinLocation.above1,\n loc_nanflow=aghast.BinLocation.below1\n )).toCategoryBinning().dump()\n```\n\n CategoryBinning(\n categories=['{nan}', '[-5, -3)', '[-3, -1)', '[-1, 1)', '[1, 3)', '[3, 5)', '[5, +inf]',\n '[-inf, -5)'])\n\n\n### Fancy binning types\n\nYou might also be wondering about `FractionBinning`, `PredicateBinning`, and `VariationBinning`.\n\n`FractionBinning` is an axis of two bins: #passing and #total, #failing and #total, or #passing and #failing. Adding it to another axis effectively makes an \"efficiency plot.\"\n\n\n```python\nh = aghast.Histogram([aghast.Axis(aghast.FractionBinning()),\n aghast.Axis(aghast.RegularBinning(10, aghast.RealInterval(-5, 5)))],\n aghast.UnweightedCounts(\n aghast.InterpretedInlineBuffer.fromarray(\n numpy.array([[ 9, 25, 29, 35, 54, 67, 60, 84, 80, 94],\n [ 99, 119, 109, 109, 95, 104, 102, 106, 112, 122]]))))\ndf = aghast.to_pandas(h)\ndf\n```\n\n\n\n\n
\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
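| | | unweighted |
|---|---|---|
| pass | [-5.0, -4.0) | 9 |
| | [-4.0, -3.0) | 25 |
| | [-3.0, -2.0) | 29 |
| | [-2.0, -1.0) | 35 |
| | [-1.0, 0.0) | 54 |
| | [0.0, 1.0) | 67 |
| | [1.0, 2.0) | 60 |
| | [2.0, 3.0) | 84 |
| | [3.0, 4.0) | 80 |
| | [4.0, 5.0) | 94 |
| all | [-5.0, -4.0) | 99 |
| | [-4.0, -3.0) | 119 |
| | [-3.0, -2.0) | 109 |
| | [-2.0, -1.0) | 109 |
| | [-1.0, 0.0) | 95 |
| | [0.0, 1.0) | 104 |
| | [1.0, 2.0) | 102 |
| | [2.0, 3.0) | 106 |
| | [3.0, 4.0) | 112 |
| | [4.0, 5.0) | 122 |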
\n
\n\n\n\n\n```python\ndf = df.unstack(level=0)\ndf\n```\n\n\n\n\n
\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
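| | unweighted / all | unweighted / pass |
|---|---|---|
| [-5.0, -4.0) | 99 | 9 |
| [-4.0, -3.0) | 119 | 25 |
| [-3.0, -2.0) | 109 | 29 |
| [-2.0, -1.0) | 109 | 35 |
| [-1.0, 0.0) | 95 | 54 |
| [0.0, 1.0) | 104 | 67 |
| [1.0, 2.0) | 102 | 60 |
| [2.0, 3.0) | 106 | 84 |
| [3.0, 4.0) | 112 | 80 |
| [4.0, 5.0) | 122 | 94 |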
\n
\n\n\n\n\n```python\ndf[\"unweighted\", \"pass\"] / df[\"unweighted\", \"all\"]\n```\n\n\n\n\n [-5.0, -4.0) 0.090909\n [-4.0, -3.0) 0.210084\n [-3.0, -2.0) 0.266055\n [-2.0, -1.0) 0.321101\n [-1.0, 0.0) 0.568421\n [0.0, 1.0) 0.644231\n [1.0, 2.0) 0.588235\n [2.0, 3.0) 0.792453\n [3.0, 4.0) 0.714286\n [4.0, 5.0) 0.770492\n dtype: float64\n\n\n\n`PredicateBinning` means that each bin represents a predicate (if-then rule) in the filling procedure. Aghast doesn't _have_ a filling procedure, but filling libraries can use this to encode relationships among histograms that a fitting library can take advantage of, for combined signal-control region fits, for instance. It's possible for those regions to overlap: an input datum might satisfy more than one predicate, and `overlapping_fill` determines which bin(s) were chosen: `first`, `last`, or `all`.\n\n`VariationBinning` means that each bin represents a variation of one of the parameters used to calculate the fill-variables. This is used to determine sensitivity to systematic effects, by varying them and re-filling. In this kind of binning, the same input datum enters every bin.\n\n\n```python\nxdata = numpy.random.normal(0, 1, int(1e6))\nsigma = numpy.random.uniform(-0.1, 0.8, int(1e6))\n\nh = aghast.Histogram([aghast.Axis(aghast.VariationBinning([\n aghast.Variation([aghast.Assignment(\"x\", \"nominal\")]),\n aghast.Variation([aghast.Assignment(\"x\", \"nominal + sigma\")])])),\n aghast.Axis(aghast.RegularBinning(10, aghast.RealInterval(-5, 5)))],\n aghast.UnweightedCounts(\n aghast.InterpretedInlineBuffer.fromarray(\n numpy.concatenate([\n numpy.histogram(xdata, bins=10, range=(-5, 5))[0],\n numpy.histogram(xdata + sigma, bins=10, range=(-5, 5))[0]]))))\ndf = aghast.to_pandas(h)\ndf\n```\n\n\n\n\n
\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
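| | | unweighted |
|---|---|---|
| x := nominal | [-5.0, -4.0) | 31 |
| | [-4.0, -3.0) | 1309 |
| | [-3.0, -2.0) | 21624 |
| | [-2.0, -1.0) | 135279 |
| | [-1.0, 0.0) | 341683 |
| | [0.0, 1.0) | 341761 |
| | [1.0, 2.0) | 135675 |
| | [2.0, 3.0) | 21334 |
| | [3.0, 4.0) | 1273 |
| | [4.0, 5.0) | 31 |
| x := nominal + sigma | [-5.0, -4.0) | 14 |
| | [-4.0, -3.0) | 559 |
| | [-3.0, -2.0) | 10814 |
| | [-2.0, -1.0) | 84176 |
| | [-1.0, 0.0) | 271999 |
| | [0.0, 1.0) | 367950 |
| | [1.0, 2.0) | 209479 |
| | [2.0, 3.0) | 49997 |
| | [3.0, 4.0) | 4815 |
| | [4.0, 5.0) | 193 |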
\n
\n\n\n\n\n```python\ndf.unstack(level=0)\n```\n\n\n\n\n
\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
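| | unweighted / x := nominal | unweighted / x := nominal + sigma |
|---|---|---|
| [-5.0, -4.0) | 31 | 14 |
| [-4.0, -3.0) | 1309 | 559 |
| [-3.0, -2.0) | 21624 | 10814 |
| [-2.0, -1.0) | 135279 | 84176 |
| [-1.0, 0.0) | 341683 | 271999 |
| [0.0, 1.0) | 341761 | 367950 |
| [1.0, 2.0) | 135675 | 209479 |
| [2.0, 3.0) | 21334 | 49997 |
| [3.0, 4.0) | 1273 | 4815 |
| [4.0, 5.0) | 31 | 193 |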
\n
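As a possible next step, a fitter might reduce the two variations to a per-bin relative shift, which is one simple way to quantify sensitivity to the systematic effect. The sketch below is plain Python and uses the counts printed in the table above; the names `nominal`, `shifted`, and `relative_shift` are illustrative assumptions and are not part of aghast.

```python
# Counts copied from the table above (one entry per bin from -5 to 5).
nominal = [31, 1309, 21624, 135279, 341683, 341761, 135675, 21334, 1273, 31]
shifted = [14, 559, 10814, 84176, 271999, 367950, 209479, 49997, 4815, 193]
low_edges = [-5.0 + i for i in range(10)]

def relative_shift(varied, reference):
    # Hypothetical helper: (varied - reference) / reference for each bin,
    # or None where the reference bin is empty and the ratio is undefined.
    return [(v - r) / float(r) if r != 0 else None
            for v, r in zip(varied, reference)]

shifts = relative_shift(shifted, nominal)
for low, shift in zip(low_edges, shifts):
    label = '[{0:.1f}, {1:.1f})'.format(low, low + 1.0)
    print(label, 'undefined' if shift is None else '{0:+.3f}'.format(shift))
```

Because both rows were filled from the same input data, the shift reflects only the variation itself and not independent statistical fluctuations between two samples.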
\n\n\n\n### Collections\n\nYou can gather many objects (histograms, functions, ntuples) into a `Collection`, partly for convenience of encapsulating all of them in one object.\n\n\n```python\naghast.Collection({\"one\": fromroot, \"two\": ghastly_hist}).dump()\n```\n\n Collection(\n objects={\n 'one': Histogram(\n axis=[\n Axis(\n binning=\n RegularBinning(\n num=80,\n interval=RealInterval(low=-5.0, high=5.0),\n overflow=RealOverflow(loc_underflow=BinLocation.below1, loc_overflow=BinLocation.above1)),\n statistics=[\n Statistics(\n moments=[\n Moments(sumwxn=InterpretedInlineInt64Buffer(buffer=[1e+07]), n=0),\n Moments(sumwxn=InterpretedInlineFloat64Buffer(buffer=[1e+07]), n=0, weightpower=1),\n Moments(sumwxn=InterpretedInlineFloat64Buffer(buffer=[1e+07]), n=0, weightpower=2),\n Moments(sumwxn=InterpretedInlineFloat64Buffer(buffer=[2468.31]), n=1, weightpower=1),\n Moments(\n sumwxn=InterpretedInlineFloat64Buffer(buffer=[1.00118e+07]),\n n=2,\n weightpower=1)\n ])\n ])\n ],\n counts=\n UnweightedCounts(\n counts=\n InterpretedInlineFloat64Buffer(\n buffer=\n [0.00000e+00 2.00000e+00 5.00000e+00 9.00000e+00 1.50000e+01 2.90000e+01\n 4.90000e+01 8.00000e+01 1.04000e+02 2.37000e+02 3.52000e+02 5.55000e+02\n 8.67000e+02 1.44700e+03 2.04600e+03 3.03700e+03 4.56200e+03 6.80500e+03\n 9.54000e+03 1.35290e+04 1.85840e+04 2.55930e+04 3.50000e+04 4.60240e+04\n 5.91030e+04 7.64920e+04 9.64410e+04 1.19873e+05 1.46159e+05 1.77533e+05\n 2.10628e+05 2.46316e+05 2.83292e+05 3.21377e+05 3.59314e+05 3.93857e+05\n 4.26446e+05 4.53031e+05 4.74806e+05 4.89846e+05 4.96646e+05 4.97922e+05\n 4.90499e+05 4.73200e+05 4.53527e+05 4.25650e+05 3.93297e+05 3.58537e+05\n 3.21099e+05 2.82519e+05 2.46469e+05 2.11181e+05 1.77550e+05 1.47417e+05\n 1.20322e+05 9.65920e+04 7.66650e+04 5.95870e+04 4.57760e+04 3.44590e+04\n 2.59000e+04 1.88760e+04 1.35760e+04 9.57100e+03 6.66200e+03 4.62900e+03\n 3.16100e+03 2.06900e+03 1.33400e+03 8.78000e+02 5.81000e+02 3.32000e+02\n 2.20000e+02 1.35000e+02 
6.50000e+01 3.90000e+01 2.60000e+01 1.90000e+01\n 1.50000e+01 4.00000e+00 4.00000e+00 0.00000e+00]))),\n 'two': Histogram(\n axis=[\n Axis(binning=RegularBinning(num=80, interval=RealInterval(low=-5.0, high=5.0)))\n ],\n counts=\n UnweightedCounts(\n counts=\n InterpretedInlineInt64Buffer(\n buffer=\n [ 2 5 9 15 29 49 80 104 237 352\n 555 867 1447 2046 3037 4562 6805 9540 13529 18584\n 25593 35000 46024 59103 76492 96441 119873 146159 177533 210628\n 246316 283292 321377 359314 393857 426446 453031 474806 489846 496646\n 497922 490499 473200 453527 425650 393297 358537 321099 282519 246469\n 211181 177550 147417 120322 96592 76665 59587 45776 34459 25900\n 18876 13576 9571 6662 4629 3161 2069 1334 878 581\n 332 220 135 65 39 26 19 15 4 4])))\n })\n\n\nNot only for convenience: [you can also define](https://github.com/scikit-hep/aghast/blob/master/specification.adoc#Collection) an `Axis` in the `Collection` to subdivide all contents by that `Axis`. For instance, you can make a collection of qualitatively different histograms all have a signal and control region with `PredicateBinning`, or all have systematic variations with `VariationBinning`.\n\nIt is not necessary to rely on naming conventions to communicate this information from filler to fitter.\n\n### Histogram \u2192 histogram conversions\n\nI said in the introduction that aghast does not fill histograms and does not plot histograms\u2014the two things data analysts are expecting to do. These would be done by user-facing libraries.\n\nAghast does, however, transform histograms into other histograms, and not just among formats. You can combine histograms with `+`. 
In addition to adding histogram counts, it combines auxiliary statistics appropriately (if possible).\n\n\n```python\nh1 = aghast.Histogram([\n aghast.Axis(aghast.RegularBinning(10, aghast.RealInterval(-5, 5)),\n statistics=[aghast.Statistics(\n moments=[\n aghast.Moments(aghast.InterpretedInlineBuffer.fromarray(numpy.array([10])), n=1),\n aghast.Moments(aghast.InterpretedInlineBuffer.fromarray(numpy.array([20])), n=2)],\n quantiles=[\n aghast.Quantiles(aghast.InterpretedInlineBuffer.fromarray(numpy.array([30])), p=0.5)],\n mode=aghast.Modes(aghast.InterpretedInlineBuffer.fromarray(numpy.array([40]))),\n min=aghast.Extremes(aghast.InterpretedInlineBuffer.fromarray(numpy.array([50]))),\n max=aghast.Extremes(aghast.InterpretedInlineBuffer.fromarray(numpy.array([60]))))])],\n aghast.UnweightedCounts(aghast.InterpretedInlineBuffer.fromarray(numpy.arange(10))))\nh2 = aghast.Histogram([\n aghast.Axis(aghast.RegularBinning(10, aghast.RealInterval(-5, 5)),\n statistics=[aghast.Statistics(\n moments=[\n aghast.Moments(aghast.InterpretedInlineBuffer.fromarray(numpy.array([100])), n=1),\n aghast.Moments(aghast.InterpretedInlineBuffer.fromarray(numpy.array([200])), n=2)],\n quantiles=[\n aghast.Quantiles(aghast.InterpretedInlineBuffer.fromarray(numpy.array([300])), p=0.5)],\n mode=aghast.Modes(aghast.InterpretedInlineBuffer.fromarray(numpy.array([400]))),\n min=aghast.Extremes(aghast.InterpretedInlineBuffer.fromarray(numpy.array([500]))),\n max=aghast.Extremes(aghast.InterpretedInlineBuffer.fromarray(numpy.array([600]))))])],\n aghast.UnweightedCounts(aghast.InterpretedInlineBuffer.fromarray(numpy.arange(100, 200, 10))))\n```\n\n\n```python\n(h1 + h2).dump()\n```\n\n Histogram(\n axis=[\n Axis(\n binning=RegularBinning(num=10, interval=RealInterval(low=-5.0, high=5.0)),\n statistics=[\n Statistics(\n moments=[\n Moments(sumwxn=InterpretedInlineInt64Buffer(buffer=[110]), n=1),\n Moments(sumwxn=InterpretedInlineInt64Buffer(buffer=[220]), n=2)\n ],\n 
min=Extremes(values=InterpretedInlineInt64Buffer(buffer=[50])),\n max=Extremes(values=InterpretedInlineInt64Buffer(buffer=[600])))\n ])\n ],\n counts=\n UnweightedCounts(\n counts=InterpretedInlineInt64Buffer(buffer=[100 111 122 133 144 155 166 177 188 199])))\n\n\nThe corresponding moments of `h1` and `h2` were matched and added, quantiles and modes were dropped (no way to combine them), and the correct minimum and maximum were picked; the histogram contents were added as well.\n\nAnother important histogram \u2192 histogram conversion is axis-reduction, which can take three forms:\n\n * slicing an axis, either dropping the eliminated bins or adding them to underflow/overflow (if possible, depends on binning type);\n * rebinning by combining neighboring bins;\n * projecting out an axis, removing it entirely, summing over all existing bins.\n\nAll of these operations use a Pandas-inspired `loc`/`iloc` syntax.\n\n\n```python\nh = aghast.Histogram(\n [aghast.Axis(aghast.RegularBinning(10, aghast.RealInterval(-5, 5)))],\n aghast.UnweightedCounts(\n aghast.InterpretedInlineBuffer.fromarray(numpy.array([0, 10, 20, 30, 40, 50, 60, 70, 80, 90]))))\n```\n\n`loc` slices in the data's coordinate system. `1.5` rounds up to bin index `6`. 
The first six bins (indices `0` through `5`) get combined into an underflow bin: `150 = 0 + 10 + 20 + 30 + 40 + 50`.\n\n\n```python\nh.loc[1.5:].dump()\n```\n\n Histogram(\n axis=[\n Axis(\n binning=\n RegularBinning(\n num=4,\n interval=RealInterval(low=1.0, high=5.0),\n overflow=\n RealOverflow(\n loc_underflow=BinLocation.above1,\n minf_mapping=RealOverflow.missing,\n pinf_mapping=RealOverflow.missing,\n nan_mapping=RealOverflow.missing)))\n ],\n counts=UnweightedCounts(counts=InterpretedInlineInt64Buffer(buffer=[60 70 80 90 150])))\n\n\n`iloc` slices by bin index number.\n\n\n```python\nh.iloc[6:].dump()\n```\n\n Histogram(\n axis=[\n Axis(\n binning=\n RegularBinning(\n num=4,\n interval=RealInterval(low=1.0, high=5.0),\n overflow=\n RealOverflow(\n loc_underflow=BinLocation.above1,\n minf_mapping=RealOverflow.missing,\n pinf_mapping=RealOverflow.missing,\n nan_mapping=RealOverflow.missing)))\n ],\n counts=UnweightedCounts(counts=InterpretedInlineInt64Buffer(buffer=[60 70 80 90 150])))\n\n\nSlices have a `start`, `stop`, and `step` (`start:stop:step`). 
The `step` parameter rebins:\n\n\n```python\nh.iloc[::2].dump()\n```\n\n Histogram(\n axis=[\n Axis(binning=RegularBinning(num=5, interval=RealInterval(low=-5.0, high=5.0)))\n ],\n counts=UnweightedCounts(counts=InterpretedInlineInt64Buffer(buffer=[10 50 90 130 170])))\n\n\nThus, you can slice and rebin as part of the same operation.\n\nProjecting uses the same mechanism, except that `None` passed as an axis's slice projects it.\n\n\n```python\nh2 = aghast.Histogram(\n [aghast.Axis(aghast.RegularBinning(10, aghast.RealInterval(-5, 5))),\n aghast.Axis(aghast.RegularBinning(10, aghast.RealInterval(-5, 5)))],\n aghast.UnweightedCounts(\n aghast.InterpretedInlineBuffer.fromarray(numpy.arange(100))))\n\nh2.iloc[:, None].dump()\n```\n\n Histogram(\n axis=[\n Axis(binning=RegularBinning(num=10, interval=RealInterval(low=-5.0, high=5.0)))\n ],\n counts=\n UnweightedCounts(\n counts=InterpretedInlineInt64Buffer(buffer=[45 145 245 345 445 545 645 745 845 945])))\n\n\nThus, all three axis reduction operations can be performed in a single syntax.\n\nIn general, an n-dimensional ghastly histogram can be sliced like an n-dimensional Numpy array. 
This includes integer and boolean indexing (though that necessarily changes the binning to `IrregularBinning`).\n\n\n```python\nh.iloc[[4, 3, 6, 7, 1]].dump()\n```\n\n Histogram(\n axis=[\n Axis(\n binning=\n IrregularBinning(\n intervals=[\n RealInterval(low=-1.0, high=0.0),\n RealInterval(low=-2.0, high=-1.0),\n RealInterval(low=1.0, high=2.0),\n RealInterval(low=2.0, high=3.0),\n RealInterval(low=-4.0, high=-3.0)\n ]))\n ],\n counts=UnweightedCounts(counts=InterpretedInlineInt64Buffer(buffer=[40 30 60 70 10])))\n\n\n\n```python\nh.iloc[[True, False, True, False, True, False, True, False, True, False]].dump()\n```\n\n Histogram(\n axis=[\n Axis(\n binning=\n IrregularBinning(\n intervals=[\n RealInterval(low=-5.0, high=-4.0),\n RealInterval(low=-3.0, high=-2.0),\n RealInterval(low=-1.0, high=0.0),\n RealInterval(low=1.0, high=2.0),\n RealInterval(low=3.0, high=4.0)\n ]))\n ],\n counts=UnweightedCounts(counts=InterpretedInlineInt64Buffer(buffer=[0 20 40 60 80])))\n\n\n`loc` for numerical binnings accepts\n\n * a real number\n * a real-valued slice\n * `None` for projection\n * ellipsis (`...`)\n\n`loc` for categorical binnings accepts\n\n * a string\n * an iterable of strings\n * an _empty_ slice\n * `None` for projection\n * ellipsis (`...`)\n\n`iloc` accepts\n\n * an integer\n * an integer-valued slice\n * `None` for projection\n * integer-valued array-like\n * boolean-valued array-like\n * ellipsis (`...`)\n\n### Bin counts \u2192 Numpy\n\nFrequently, one wants to extract bin counts from a histogram. 
The `loc`/`iloc` syntax above creates _histograms_ from _histograms_, not bin counts.\n\nA histogram's `counts` property has a slice syntax.\n\n\n```python\nallcounts = numpy.arange(12) * numpy.arange(12)[:, None] # multiplication table\nallcounts[10, :] = -999 # underflows\nallcounts[11, :] = 999 # overflows\nallcounts[:, 0] = -999 # underflows\nallcounts[:, 1] = 999 # overflows\nprint(allcounts)\n```\n\n [[-999 999 0 0 0 0 0 0 0 0 0 0]\n [-999 999 2 3 4 5 6 7 8 9 10 11]\n [-999 999 4 6 8 10 12 14 16 18 20 22]\n [-999 999 6 9 12 15 18 21 24 27 30 33]\n [-999 999 8 12 16 20 24 28 32 36 40 44]\n [-999 999 10 15 20 25 30 35 40 45 50 55]\n [-999 999 12 18 24 30 36 42 48 54 60 66]\n [-999 999 14 21 28 35 42 49 56 63 70 77]\n [-999 999 16 24 32 40 48 56 64 72 80 88]\n [-999 999 18 27 36 45 54 63 72 81 90 99]\n [-999 999 -999 -999 -999 -999 -999 -999 -999 -999 -999 -999]\n [-999 999 999 999 999 999 999 999 999 999 999 999]]\n\n\n\n```python\nh2 = aghast.Histogram(\n [aghast.Axis(aghast.RegularBinning(10, aghast.RealInterval(-5, 5),\n aghast.RealOverflow(loc_underflow=aghast.RealOverflow.above1,\n loc_overflow=aghast.RealOverflow.above2))),\n aghast.Axis(aghast.RegularBinning(10, aghast.RealInterval(-5, 5),\n aghast.RealOverflow(loc_underflow=aghast.RealOverflow.below2,\n loc_overflow=aghast.RealOverflow.below1)))],\n aghast.UnweightedCounts(\n aghast.InterpretedInlineBuffer.fromarray(allcounts)))\n```\n\n\n```python\nprint(h2.counts[:, :])\n```\n\n [[ 0 0 0 0 0 0 0 0 0 0]\n [ 2 3 4 5 6 7 8 9 10 11]\n [ 4 6 8 10 12 14 16 18 20 22]\n [ 6 9 12 15 18 21 24 27 30 33]\n [ 8 12 16 20 24 28 32 36 40 44]\n [10 15 20 25 30 35 40 45 50 55]\n [12 18 24 30 36 42 48 54 60 66]\n [14 21 28 35 42 49 56 63 70 77]\n [16 24 32 40 48 56 64 72 80 88]\n [18 27 36 45 54 63 72 81 90 99]]\n\n\nTo get the underflows and overflows, set the slice extremes to `-inf` and `+inf`.\n\n\n```python\nprint(h2.counts[-numpy.inf:numpy.inf, :])\n```\n\n [[-999 -999 -999 -999 -999 -999 -999 -999 -999 -999]\n [ 
0 0 0 0 0 0 0 0 0 0]\n [ 2 3 4 5 6 7 8 9 10 11]\n [ 4 6 8 10 12 14 16 18 20 22]\n [ 6 9 12 15 18 21 24 27 30 33]\n [ 8 12 16 20 24 28 32 36 40 44]\n [ 10 15 20 25 30 35 40 45 50 55]\n [ 12 18 24 30 36 42 48 54 60 66]\n [ 14 21 28 35 42 49 56 63 70 77]\n [ 16 24 32 40 48 56 64 72 80 88]\n [ 18 27 36 45 54 63 72 81 90 99]\n [ 999 999 999 999 999 999 999 999 999 999]]\n\n\n\n```python\nprint(h2.counts[:, -numpy.inf:numpy.inf])\n```\n\n [[-999 0 0 0 0 0 0 0 0 0 0 999]\n [-999 2 3 4 5 6 7 8 9 10 11 999]\n [-999 4 6 8 10 12 14 16 18 20 22 999]\n [-999 6 9 12 15 18 21 24 27 30 33 999]\n [-999 8 12 16 20 24 28 32 36 40 44 999]\n [-999 10 15 20 25 30 35 40 45 50 55 999]\n [-999 12 18 24 30 36 42 48 54 60 66 999]\n [-999 14 21 28 35 42 49 56 63 70 77 999]\n [-999 16 24 32 40 48 56 64 72 80 88 999]\n [-999 18 27 36 45 54 63 72 81 90 99 999]]\n\n\nAlso note that the underflows are now all below the normal bins and overflows are now all above the normal bins, regardless of how they were arranged in the ghast. 
This allows analysis code to be independent of histogram source.\n\n## Other types\n\nAghast can attach fit functions to histograms, can store standalone functions, such as lookup tables, and can store ntuples for unbinned fits or machine learning.\n\n# Acknowledgements\n\nSupport for this work was provided by NSF cooperative agreement OAC-1836650 (IRIS-HEP), grant OAC-1450377 (DIANA/HEP) and PHY-1520942 (US-CMS LHC Ops).\n\nThanks especially to the gracious help of [aghast contributors](https://github.com/scikit-hep/aghast/graphs/contributors)!\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "https://github.com/scikit-hep/aghast/releases", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/scikit-hep/aghast", "keywords": "", "license": "BSD 3-clause", "maintainer": "Jim Pivarski (IRIS-HEP)", "maintainer_email": "pivarski@princeton.edu", "name": "aghast", "package_url": "https://pypi.org/project/aghast/", "platform": "Any", "project_url": "https://pypi.org/project/aghast/", "project_urls": { "Download": "https://github.com/scikit-hep/aghast/releases", "Homepage": "https://github.com/scikit-hep/aghast" }, "release_url": "https://pypi.org/project/aghast/0.2.1/", "requires_dist": [ "flatbuffers (>=1.8.0)", "numpy" ], "requires_python": "", "summary": "Aghast: aggregated, histogram-like statistics, sharable as Flatbuffers.", "version": "0.2.1" }, "last_serial": 5135796, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "08559499a6864632e2969101f9e1da5f", "sha256": "40797f3ba946aa3b8660e02446c6762362a7887627494be5d9981832a7afa9ab" }, "downloads": -1, "filename": "aghast-0.1.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "08559499a6864632e2969101f9e1da5f", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 132674, "upload_time": "2019-03-31T14:08:46", "url": 
"https://files.pythonhosted.org/packages/02/44/0e44368da8bdf879f6bf68151c73c2e5cf51d6758bf87c0d160902dfba90/aghast-0.1.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "49c44457ddc64bad447ce6a214b80487", "sha256": "cfcc0d0786d5f48c08f1c4f53bb0ece903dca223edc40548e886976f88291c99" }, "downloads": -1, "filename": "aghast-0.1.0.tar.gz", "has_sig": false, "md5_digest": "49c44457ddc64bad447ce6a214b80487", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 109577, "upload_time": "2019-03-31T14:08:47", "url": "https://files.pythonhosted.org/packages/68/5a/d6e8b422e397d5219021c81fd12d234cfacd3ed30264be7ff6b2ed094531/aghast-0.1.0.tar.gz" } ], "0.1.0rc1": [ { "comment_text": "", "digests": { "md5": "2110989b49ef1c69e8646cbe88cf55df", "sha256": "d148cb9a46e3f6a3e8b2384eaeb18af4faa16b5995738ad9356461957841bedd" }, "downloads": -1, "filename": "aghast-0.1.0rc1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "2110989b49ef1c69e8646cbe88cf55df", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 131429, "upload_time": "2019-03-31T00:56:57", "url": "https://files.pythonhosted.org/packages/d2/33/cc405a95ad720a1c64e5d41dea5688b44bf049cd3a9fb05b1128ac1b9050/aghast-0.1.0rc1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "6a9231028f169e0a2e397c21974b93e3", "sha256": "d661db5f6a22e4705c0849612fa4cc6a9425f11f59035291208be6bfdb17602f" }, "downloads": -1, "filename": "aghast-0.1.0rc1.tar.gz", "has_sig": false, "md5_digest": "6a9231028f169e0a2e397c21974b93e3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 88582, "upload_time": "2019-03-31T00:57:01", "url": "https://files.pythonhosted.org/packages/6d/23/2c12782a99007514d5d6f93ae8e2a40c84fdc62867e0d23d3126fbe088b2/aghast-0.1.0rc1.tar.gz" } ], "0.1.0rc2": [ { "comment_text": "", "digests": { "md5": "7a665a0ac9cc802bac6fc9841c263b06", "sha256": 
"4c9069db8a65790afe1a127e5c3988379b6f9309411d57ae4d61d384a1761187" }, "downloads": -1, "filename": "aghast-0.1.0rc2-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "7a665a0ac9cc802bac6fc9841c263b06", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 131457, "upload_time": "2019-03-31T00:59:29", "url": "https://files.pythonhosted.org/packages/18/6b/a9fb914069422e7d310a8566bcabe34c60c88cfbce9a4cd53c1a51fc56dd/aghast-0.1.0rc2-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "6dbac99d02d7543d220db8a91a5badfc", "sha256": "d6ed8a79546d8c5650b3c19815ae67f375b333a82761f89aaedab84762e0a818" }, "downloads": -1, "filename": "aghast-0.1.0rc2.tar.gz", "has_sig": false, "md5_digest": "6dbac99d02d7543d220db8a91a5badfc", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 88640, "upload_time": "2019-03-31T00:59:32", "url": "https://files.pythonhosted.org/packages/da/47/b3d19bc3b1844c003a65d09d86c2a47666eeebcc8068f04c43d61e4919e7/aghast-0.1.0rc2.tar.gz" } ], "0.1.0rc3": [ { "comment_text": "", "digests": { "md5": "62824af12544592f1f80186fa5934d40", "sha256": "3a188f4e635529b61157843ffff3548f9903933f63dfb80b5acc84c86932269f" }, "downloads": -1, "filename": "aghast-0.1.0rc3-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "62824af12544592f1f80186fa5934d40", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 132580, "upload_time": "2019-03-31T01:04:45", "url": "https://files.pythonhosted.org/packages/d1/e6/973edd266f966ba6dd2cc5130d416efe46323310e3c7c94a1071696d7732/aghast-0.1.0rc3-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "c00074f7e081d1a2c9e929d43d6bf874", "sha256": "9128723166c7a56aa39d768b4bf684fd981bbb35639c1c40201487d002aa35c7" }, "downloads": -1, "filename": "aghast-0.1.0rc3.tar.gz", "has_sig": false, "md5_digest": "c00074f7e081d1a2c9e929d43d6bf874", "packagetype": "sdist", "python_version": "source", 
"requires_python": null, "size": 90653, "upload_time": "2019-03-31T01:04:47", "url": "https://files.pythonhosted.org/packages/25/91/f2fa7ba03c4bd053c77b21481767a313e794ff0c32693a4063e576fad74a/aghast-0.1.0rc3.tar.gz" } ], "0.1.0rc4": [ { "comment_text": "", "digests": { "md5": "737cac97cbe330654c0f28d0e1b3db15", "sha256": "5c3acd424e7c962e11e6a8248dd52a9f59c74fd38a1bd607bbdd65fa7943fa78" }, "downloads": -1, "filename": "aghast-0.1.0rc4-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "737cac97cbe330654c0f28d0e1b3db15", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 132581, "upload_time": "2019-03-31T01:45:01", "url": "https://files.pythonhosted.org/packages/f2/b0/39eee3e9aaa12d67ae30468d69119f9062cb9546f739fe527168887ab711/aghast-0.1.0rc4-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "be913c885658fff5c15e0ed6d2b0f204", "sha256": "d5c80ce3bd4b4e4dcd64b4c68f268e62b4e8e6469496e5e0116b3c1f45cf9d09" }, "downloads": -1, "filename": "aghast-0.1.0rc4.tar.gz", "has_sig": false, "md5_digest": "be913c885658fff5c15e0ed6d2b0f204", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 108986, "upload_time": "2019-03-31T01:45:02", "url": "https://files.pythonhosted.org/packages/db/9c/7cb8c71516eb0f4035eaa3e4de629fffdae96b7b674ec878ab5743673441/aghast-0.1.0rc4.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "249467f76f5f7f3251ac0cc9644ddca5", "sha256": "1e2e67d6bc6235338d2a22b7c1eed4430396798b7596c3a3af9c131d7f8b0c49" }, "downloads": -1, "filename": "aghast-0.2.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "249467f76f5f7f3251ac0cc9644ddca5", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 135759, "upload_time": "2019-04-11T20:48:16", "url": "https://files.pythonhosted.org/packages/dd/b6/1bec45e1efa55b6c998143f86da2ec9f43ab07599428b2f61792a647024f/aghast-0.2.0-py2.py3-none-any.whl" }, { "comment_text": "", 
"digests": { "md5": "19269e4a4d7dd73aff582eef44ada411", "sha256": "252afb247e055b4b9e793e7a9c461d48940c24db84faf3577517357c792f4d4d" }, "downloads": -1, "filename": "aghast-0.2.0.tar.gz", "has_sig": false, "md5_digest": "19269e4a4d7dd73aff582eef44ada411", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 137516, "upload_time": "2019-04-11T20:48:18", "url": "https://files.pythonhosted.org/packages/0b/d3/5ee284ba8f8b3fae1f515503dd5efc805a060f238cbc69ceae23cb7a770e/aghast-0.2.0.tar.gz" } ], "0.2.1": [ { "comment_text": "", "digests": { "md5": "bfca193395dfd928cac5b22e24abb77c", "sha256": "d945d3adb55dea3a1cd465730c49110a8667651d9d2ae1ff6643900b0d3e65c8" }, "downloads": -1, "filename": "aghast-0.2.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "bfca193395dfd928cac5b22e24abb77c", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 135967, "upload_time": "2019-04-12T21:42:29", "url": "https://files.pythonhosted.org/packages/9a/15/b67a8f15912dbfbbf4ae6025c4284c287a6be67e580d618afb38701af7a2/aghast-0.2.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "47671b75afe7e3406ebed22ac0aaff1f", "sha256": "5a60f84d8ecb1b5d56368d6eb839fdae93496d9d96b2a1ead8f820732679ccb4" }, "downloads": -1, "filename": "aghast-0.2.1.tar.gz", "has_sig": false, "md5_digest": "47671b75afe7e3406ebed22ac0aaff1f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 137730, "upload_time": "2019-04-12T21:42:31", "url": "https://files.pythonhosted.org/packages/fd/d9/cfbc5921f2fa64648b3aeff0a5f02a7db1287dca0a38e560896a3e805671/aghast-0.2.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "bfca193395dfd928cac5b22e24abb77c", "sha256": "d945d3adb55dea3a1cd465730c49110a8667651d9d2ae1ff6643900b0d3e65c8" }, "downloads": -1, "filename": "aghast-0.2.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "bfca193395dfd928cac5b22e24abb77c", "packagetype": 
"bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 135967, "upload_time": "2019-04-12T21:42:29", "url": "https://files.pythonhosted.org/packages/9a/15/b67a8f15912dbfbbf4ae6025c4284c287a6be67e580d618afb38701af7a2/aghast-0.2.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "47671b75afe7e3406ebed22ac0aaff1f", "sha256": "5a60f84d8ecb1b5d56368d6eb839fdae93496d9d96b2a1ead8f820732679ccb4" }, "downloads": -1, "filename": "aghast-0.2.1.tar.gz", "has_sig": false, "md5_digest": "47671b75afe7e3406ebed22ac0aaff1f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 137730, "upload_time": "2019-04-12T21:42:31", "url": "https://files.pythonhosted.org/packages/fd/d9/cfbc5921f2fa64648b3aeff0a5f02a7db1287dca0a38e560896a3e805671/aghast-0.2.1.tar.gz" } ] }