{ "info": { "author": "A. Roebel", "author_email": "axel.dot.roebel@ircam.dot.fr", "bugtrack_url": null, "classifiers": [ "Development Status :: 5 - Production/Stable", "License :: OSI Approved :: GNU General Public License (GPL)", "Operating System :: MacOS :: MacOS X", "Operating System :: Microsoft :: Windows", "Operating System :: POSIX :: Linux", "Programming Language :: Python :: 3", "Topic :: Software Development :: Libraries :: Python Modules" ], "description": "py_find_1st\n===========\n\npy_find_1st is a numpy extension that allows to find the first index\ninto an 1D-array that validates a boolean condition that can consist of\na comparison operator and a limit value.\n\nFunctionality\n-------------\n\nThis extension solves the very frequent problem of finding first indices\nwithout requiring to read the full array.\n\nThe call sequence\n\n::\n\n import numpy as np\n import utils_find_1st as utf1st\n\n limit = 0.\n rr= np.random.randn(100)\n ind = utf1st.find_1st(rr < limit, True, utf1st.cmp_equal)\n\nand more efficiently\n\n::\n\n ind = utf1st.find_1st(rr, limit, utf1st.cmp_smaller)\n\nis equivalent to\n\n::\n\n import numpy as np\n limit = 0.\n rr= np.random.randn(100)\n ind = np.flatnonzero(rr < limit)\n if len(ind) :\n ret = ind[0]\n else:\n ret = -1\n\nImplementation details\n----------------------\n\npy_find_1st is written as a numpy extension making use of a templated\nimplementation of the find_1st function that currently supports\noperating on arrays of dtypes:\n\n::\n\n [np.float64, np.float32, np.int64, np.int32, np.bool]\n\nComparison operators are selected using integer opcodes with the\nfollowing meaning:\n\n::\n\n opcode == utils_find_1st.cmp_smaller -> comp op: <\n opcode == utils_find_1st.cmp_smaller_eq -> comp op: <=\n opcode == utils_find_1st.cmp_equal -> comp op: ==\n opcode == utils_find_1st.cmp_not_equal -> comp op: !=\n opcode == utils_find_1st.cmp_larger -> comp op: <\n opcode == utils_find_1st.cmp_larger_eq -> comp op: <=\n\nPerformance\n-----------\n\nThe runtime difference is strongly depending on the number of true cases\nin the array. If the condition is never valid runtime is the same - both\nimplementations do not produce a valid index and need to compare the\nfull array - but on case that there are matches np.flatnonzero needs to\nrun through the full array and needs to create a result array with size\nthat depends o the number of matches while find_1st only produces a\nscalar result and only needs to compare the array until the first match\nis found.\n\nDepending on the size of the array and the number of matches the speed\ndifference can be very significant (easily > factor 10)\n\ntest\n----\n\nrun test/test_find_1st.py which should display \"all tests passed!\"\n\nBenchmarking\n~~~~~~~~~~~~\n\nWe can easily compare the runtime using the three lines\n\n::\n\n In [6]: timeit ind = np.flatnonzero(rr < limit)[0]\n 1.69 $\\mu$s $\\pm$ 24.5 ns per loop (mean $\\pm$ std. dev. of 7 runs, 1000000 loops each)\n\n In [4]: timeit ind = utf1st.find_1st(rr < limit, True, utf1st.cmp_equal)\n 1.13 $\\mu$s $\\pm$ 18.9 ns per loop (mean $\\pm$ std. dev. of 7 runs, 1000000 loops each)\n\n In [5]: timeit ind = utf1st.find_1st(rr, limit, utf1st.cmp_smaller)\n 270 ns $\\pm$ 5.57 ns per loop (mean $\\pm$ std. dev. of 7 runs, 1000000 loops each)\n\nWhich shows the rather significant improvement obtained by the last\nversion that does not require to perform all comparisons of the 100\nelements. In the above case the second element is tested positive. In\nthe worst case, where no valid element is present all comparisons have\nto be performed and flatnonzero does not need to create a results array,\nand therefore performance should be similar. For the small array sizes\nwe used so far the overhead of np.flanonzero is dominating the costs as\ncan be seen in the following.\n\n::\n\n In [9]: limit = -1000.\n In [10]: timeit ind = np.flatnonzero(rr < limit)\n 1.56 $\\mu$s $\\pm$ 13.8 ns per loop (mean $\\pm$ std. dev. of 7 runs, 1000000 loops each)\n\n In [11]: timeit ind = utf1st.find_1st(rr